jtenner

AssemblyScript std lib: String#slice

Today's std lib method is String#slice, a very useful method for copying part of a string into a new String. Here is the source for the function.

class String {

  slice(start: i32, end: i32 = i32.MAX_VALUE): String {
    var len = this.length;
    start = start < 0 ? max(start + len, 0) : min(start, len);
    end   = end   < 0 ? max(end   + len, 0) : min(end,   len);
    len   = end - start;
    if (len <= 0) return changetype<String>("");
    var out = __alloc(len << 1, idof<String>());
    memory.copy(out, changetype<usize>(this) + (<usize>start << 1), <usize>len << 1);
    return changetype<String>(out); // retains
  }

}

The String class is also the backing class for the string type. (The capitalization differs, but both names are treated as exactly the same class.) Now let's look at each line and go over each step of the copy process.

slice(start: i32, end: i32 = i32.MAX_VALUE): String {

Notice that you cannot call str.slice() without the first parameter in AssemblyScript because it's required. Both parameters are zero-based indexes, and a negative value counts back from the end of the string. So str.slice(3, 8) returns the part of the string from index 3 (inclusive) to index 8 (exclusive). The second parameter defaults to i32.MAX_VALUE, which is 2,147,483,647. If either parameter is larger than the string length, or resolves to an index less than 0, it is clamped into the range from 0 to the string length.
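
To make those rules concrete, here is a small sketch in plain TypeScript. It assumes JavaScript's String.prototype.slice as a stand-in, since it applies the same clamping rules for integer arguments; the variable names are only for illustration.

```typescript
// Plain TypeScript stand-in: String.prototype.slice uses the same
// index rules for integer arguments as the method discussed here.
const sample: string = "AssemblyScript"; // 14 code units

// Index 3 (inclusive) up to index 8 (exclusive).
const middlePart: string = sample.slice(3, 8); // "embly"

// Omitting `end` means "copy to the end of the string".
const fromEight: string = sample.slice(8); // "Script"

// A negative start counts back from the end: -6 + 14 = 8.
const lastSix: string = sample.slice(-6); // "Script"

// An end larger than the string length is clamped to the length.
const clampedEnd: string = sample.slice(0, 100); // "AssemblyScript"
```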

    var len = this.length;

The first statement caches the this.length value, which is a computed property. The cached value is reused in the calculations that follow.

    start = start < 0 ? max(start + len, 0) : min(start, len);
    end   = end   < 0 ? max(end   + len, 0) : min(end,   len);

The next two statements perform the same calculation for the start and the end indices. If the parameter is negative, the index is calculated as an offset from the end of the string; otherwise, it is treated as a regular index. Either way, the result is clamped between 0 and the string length.
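
The same calculation can be sketched as a standalone TypeScript function. The name normalizeIndex is hypothetical (it does not exist in the std lib), and Math.max and Math.min stand in for the AssemblyScript max and min builtins.

```typescript
// Hypothetical helper mirroring one of the two normalization lines.
// `index` is the raw parameter, `len` is the cached string length.
function normalizeIndex(index: number, len: number): number {
  return index < 0
    ? Math.max(index + len, 0) // negative: offset from the end, floor at 0
    : Math.min(index, len);    // non-negative: cap at the string length
}

const fromEnd: number = normalizeIndex(-3, 10);          // 7
const belowZero: number = normalizeIndex(-20, 10);       // 0
const inRange: number = normalizeIndex(4, 10);           // 4
const defaultEnd: number = normalizeIndex(2147483647, 10); // 10
```

Note that the default end value of i32.MAX_VALUE simply falls into the second branch and is capped at the string length.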

    len   = end - start;
    if (len <= 0) return changetype<String>("");

After calculating the final length of the resulting string, if it happens to be less than or equal to 0, return an empty string. Strings of negative length cannot exist, so the method falls back to an empty string. The changetype<String>("") expression is only used to suppress inline TypeScript errors reported in your IDE. This expression has no side effects.
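
As a rough illustration of when the early return triggers, the sketch below computes the length of the result in plain TypeScript. The function name resultLength is an assumption made for this example, and it returns 0 where the real method returns an empty string.

```typescript
// Hypothetical helper: how many code units the slice would contain.
// A result of 0 corresponds to the early `return ""` in the source.
function resultLength(start: number, end: number, len: number): number {
  const s = start < 0 ? Math.max(start + len, 0) : Math.min(start, len);
  const e = end < 0 ? Math.max(end + len, 0) : Math.min(end, len);
  const count = e - s;
  return count <= 0 ? 0 : count;
}

const normalCase: number = resultLength(3, 8, 10);    // 5
const reversed: number = resultLength(5, 2, 10);      // 0, end is before start
const pastTheEnd: number = resultLength(12, 20, 10);  // 0, both clamp to 10
```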

    var out = __alloc(len << 1, idof<String>());

This line allocates a single Block of type String. Strings in AssemblyScript are internally represented in UTF-16, where each code unit takes two bytes and this.length counts code units (most characters fit in one code unit; some, such as emoji, need two). Thus, we need to double the length to calculate how many bytes are required to store the string data. The len << 1 expression shifts the value left by one bit, which effectively doubles it. The idof<String>() builtin returns the compiler's internal numeric id for String as a constant value, used to describe the type of heap allocation we are creating. Finally, the __alloc() function itself is responsible for allocating the memory on the heap, returning a pointer to that heap allocation. Now we have a place to copy the data.
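
The byte-size arithmetic can be checked with a short TypeScript sketch. The helper name byteSizeOf is hypothetical, and JavaScript strings are used here only because their length property also counts UTF-16 code units.

```typescript
// Hypothetical helper: bytes needed to store `len` UTF-16 code units.
function byteSizeOf(len: number): number {
  // Shifting left by one bit equals multiplying by 2, as long as the
  // result still fits in a signed 32-bit integer.
  return len << 1;
}

const asciiText: string = "hello"; // 5 code units
const emojiText: string = "😀";    // 1 visible character, 2 code units

const asciiBytes: number = byteSizeOf(asciiText.length); // 10
const emojiBytes: number = byteSizeOf(emojiText.length); // 4
```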

    memory.copy(out, changetype<usize>(this) + (<usize>start << 1), <usize>len << 1);

Since the source and the destination use the same format, a simple memory.copy(destination, source, count) call is enough to copy the bytes from one place in WebAssembly memory to another.

The destination is the new heap allocation we just made. The source is changetype<usize>(this) plus the starting byte offset, which is calculated with <usize>start * 2 (or the faster equivalent <usize>start << 1). The number of bytes is calculated with len * 2 (or the faster equivalent <usize>len << 1).
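
To see the byte-level copy in isolation, here is a minimal model in plain TypeScript. It is only a sketch: typed arrays stand in for WebAssembly linear memory, the function name sliceViaByteCopy is made up for this example, and it assumes start and len have already been clamped as described earlier.

```typescript
// Sketch of the copy step. Typed arrays stand in for linear memory;
// `start` and `len` are assumed to be clamped already.
function sliceViaByteCopy(source: string, start: number, len: number): string {
  // Lay the source string out as UTF-16 code units (2 bytes each).
  const sourceUnits = new Uint16Array(source.length);
  for (let i = 0; i < source.length; i++) {
    sourceUnits[i] = source.charCodeAt(i);
  }
  const sourceBytes = new Uint8Array(sourceUnits.buffer);

  // "Allocate" the destination: len code units take (len << 1) bytes.
  const outBytes = new Uint8Array(len << 1);

  // Copy (len << 1) bytes, starting at byte offset (start << 1).
  const byteOffset = start << 1;
  outBytes.set(sourceBytes.subarray(byteOffset, byteOffset + (len << 1)));

  // Reinterpret the copied bytes as code units and rebuild a string.
  let result = "";
  new Uint16Array(outBytes.buffer).forEach((unit) => {
    result += String.fromCharCode(unit);
  });
  return result;
}

const copied: string = sliceViaByteCopy("AssemblyScript", 3, 5); // "embly"
```

Because the copy works on raw bytes, it never needs to inspect the characters themselves, which is why a single memory.copy call is sufficient.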

    return changetype<String>(out); // retains

Once the memory is copied, we can simply reinterpret the pointer as a String reference and return it.

Feel free to ask lots of questions about how the internals of AssemblyScript work. There are lots of little reasons for some of the design choices that aren't documented in my blog posts that might be confusing to others.

You can view the optimized WebAssembly output here if you want to see what it looks like.

Hope this was fun!

Best wishes,
@jtenner
