DEV Community

Davide Santangelo
Davide Santangelo

Posted on

The Power of RLE: How to Reduce File Sizes with Simple Code with Ruby.

Run Length Encoding (RLE) is a simple lossless data compression algorithm that works by replacing consecutive occurrences of a character with that character followed by the number of times it appears consecutively. For example, the string "AAABBBCCCC" would be compressed to "A3B3C4".

RLE is commonly used in situations where the data being compressed contains long runs of identical values, such as in images, audio, or video files.

Compression

def compress(string)
  compressed = ""
  count = 1
  prev_char = string[0]

  string.chars.each_with_index do |char, index|
    if index == 0
      next
    end

    if char == prev_char
      count += 1
    else
      compressed += prev_char + count.to_s
      count = 1
      prev_char = char
    end
  end

  compressed += prev_char + count.to_s

  if compressed.length < string.length
    return compressed
  else
    return string
  end
end
Enter fullscreen mode Exit fullscreen mode
  1. The compress function takes a string as input and initializes two variables, compressed and count, to empty strings and 1, respectively. It also initializes prev_char to the first character in the input string.

  2. The function iterates over each character in the input string using the chars method. It uses the each_with_index method to get both the character and its index in the string.

  3. If the character is the first in the string (i.e. the index is 0), the function skips it and moves on to the next character.

  4. If the current character is the same as the previous character, the function increments the count variable by 1.

  5. If the current character is different from the previous character, the function appends the previous character and its count to the compressed string, resets the count variable to 1, and sets the prev_char variable to the current character.

  6. After the loop finishes, the function appends the last character and its count to the compressed string.

  7. Finally, the function checks whether the length of the compressed string is less than the length of the input string.

If it is, the function returns the compressed string. If not, it returns the original input string.

require 'minitest/autorun'

class TestRLE < Minitest::Test
  def test_compresses_simple_string
    assert_equal 'A3B3C4', compress('AAABBBCCCC')
  end

  def test_does_not_compress_short_string
    assert_equal 'ABC', compress('ABC')
  end

  def test_compresses_string_with_spaces
    assert_equal 'A2 B2 C2', compress('AA BB CC')
  end

  def test_compresses_string_with_numbers
    assert_equal 'A2B2C2D2', compress('AABBCCDD')
    assert_equal '1A2B3C', compress('ABBCCC')
  end

  def test_does_not_compress_non_ascii_characters
    assert_equal 'Α3Β3Γ3', compress('ΑΑΑΒΒΒΓΓΓ')
  end
end
Enter fullscreen mode Exit fullscreen mode

Decompression

def decompress(string)
  decompressed = ""
  i = 0

  while i < string.length do
    char = string[i]
    if char.match?(/[a-zA-Z]/)
      if i + 1 < string.length && i + 2 < string.length && string[i+1].match?(/\d/) && string[i+2].match?(/\d/)
        count = string[i+1..i+2].to_i
        i += 3
      elsif i + 1 < string.length && string[i+1].match?(/\d/)
        count = string[i+1].to_i
        i += 2
      else
        raise "Invalid input string"
      end
      decompressed += char * count
    else
      raise "Invalid input string"
    end
  end

  decompressed
end

Enter fullscreen mode Exit fullscreen mode
  1. The decompress function takes a compressed string as input and initializes two variables, decompressed and i, to empty strings and 0, respectively.

  2. The function enters a while loop that continues as long as i is less than the length of the input string.

  3. Inside the loop, the function gets the current character at index i and checks whether it is a letter using a regular expression. If it is not a letter, the function raises an error.

  4. If the current character is a letter, the function gets the next two characters as a substring starting at index i+1, and converts it to an integer using the to_i method.

  5. The function appends char repeated count times to the decompressed string.

  6. The function increments i by 3, since each letter in the compressed string is followed by two digits representing its count.

  7. After the loop finishes, the function returns the decompressed string.

require 'minitest/autorun'

class TestRLE < Minitest::Test
  # ...

  def test_decompresses_simple_string
    assert_equal 'AAABBBCCCC', decompress('A3B3C4')
  end

  def test_does_not_decompress_non_ascii_characters
    assert_raises(RuntimeError) { decompress('Α3Β3Γ3') }
  end

  def test_raises_error_with_invalid_input_string
    assert_raises(RuntimeError) { decompress('A3B3C4D5') }
  end
end
Enter fullscreen mode Exit fullscreen mode

Conclusion

In conclusion, the Run-Length Encoding (RLE) algorithm is a simple and effective method for compressing data that contains long runs of identical characters. It works by replacing each run with a count of the number of consecutive characters, followed by the character itself.

RLE can be easily implemented in any programming language and is particularly useful for compressing data with a high degree of repetition, such as text files, images, and videos. However, RLE may not be effective for compressing data with a low degree of repetition or complex patterns.

To use RLE, we first compress the data using a function that takes the input string as an argument and returns the compressed string. Then, to decompress the data, we use a separate function that takes the compressed string as an argument and returns the original uncompressed string.

When implementing RLE, it's important to handle edge cases, such as strings with special characters or invalid input strings, and to test the functions thoroughly to ensure that they work correctly in all scenarios.

Overall, RLE is a useful and widely used algorithm for data compression, and understanding how it works can help developers create more efficient and optimized applications.

Top comments (0)