DEV Community

Eda

Posted on • Updated on • Originally published at rivea0.github.io

Custom endsWith and startsWith Functions

Originally published on February 22, 2022 at https://rivea0.github.io/blog

When working with strings, there may come a time when you want to check whether a string starts or ends with another given string. Luckily, JavaScript and Python have built-in methods for the job, aptly named startsWith() and endsWith() in JavaScript, and startswith() and endswith() in Python. We don't need to reinvent the wheel, but let's say we want to implement them our own way. Because, why not?

Negative Indexing

One thing that might be helpful before we start is the concept of negative indexing. In some languages (though not all), the last character of a string can be accessed with the index -1, the second to last with -2, and so on. Python allows negative indexes for strings (and for other sequences such as lists and tuples), and JavaScript's slice method also accepts negative indexes. These will come in handy.

Python example:

name = 'David'
name[-1] # d
name[-2] # i
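As a small illustrative sketch (the variable last_three is only a name picked for this example), a negative index -k in Python refers to the same character as len(name) - k, and a negative start in a slice works the same way:

```python
name = 'David'

# A negative index -k points at the same character as len(name) - k.
for k in range(1, len(name) + 1):
    assert name[-k] == name[len(name) - k]

# A slice with a negative start runs from that position to the end.
last_three = name[-3:]
print(last_three)  # vid
```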

We cannot access a character directly with a negative index in JavaScript, as it returns undefined, but we can use slice:

let name = 'David';
name[-1] // undefined
name.slice(-1) // d
name.slice(-2) // id

Implementing endsWith

Now, let's check if a string ends with another given string. Since negative indexes count from the end of the string, we can try something like this:

Python example:

name = 'David'
target = 'vid'

name[-len(target):] == target # True

JavaScript example:

let name = 'David';
let target = 'vid';

name.slice(-target.length) === target // true

Let's walk through what we did step by step so that it's clearer. First, we get target's length, which in our example is 3 (the length of 'vid'). Then, with negative indexing, we take the part of the original string that starts at index -3 and compare it with target. name.slice(-target.length) starts at index -3 of name and runs to the end of the string, which gives 'vid', and voilà! They're the same.
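The same steps can be spelled out one at a time in Python. This is only a sketch to make the intermediate values visible; n, tail, and result are names made up for the illustration:

```python
name = 'David'
target = 'vid'

n = len(target)           # length of the target
tail = name[-n:]          # same as name[len(name) - n:], here name[2:]
result = tail == target   # compare the tail with the target

print(n, tail, result)    # 3 vid True
```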

It is a nice, one-liner way to do it. Now let's try our hand at startsWith, which will be easier than this one.

Implementing startsWith

We'll use the same ingredients: slicing and the target string's length. Let's do it.

Python example:

name = 'David'
target = 'Dav'
name[:len(target)] == target # True

JavaScript example:

let name = 'David';
let target = 'Dav';
name.slice(0, target.length) === target // true

Slicing the original string from the start up to the length of the target string gives us a substring with the same length as target. So name.slice(0, target.length) in this case starts at the beginning of the string and goes up to, but does not include, index 3 (the length of 'Dav'). We then only check whether the two strings are the same, and that's it.
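One detail worth seeing in action is what happens when target is longer than the string. In Python, a slice that runs past the end is simply clamped, so the comparison comes out False without raising an error. A minimal sketch (head is just an illustrative name):

```python
name = 'Da'
target = 'Dav'

# The slice asks for 3 characters, but only 2 exist; Python returns what it can.
head = name[:len(target)]
print(head)             # Da
print(head == target)   # False

# Plain indexing, unlike slicing, is not clamped.
try:
    name[len(target)]
except IndexError:
    print('IndexError')
```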

Dissecting the Implementations

We have written great one-liners, and just implemented our own way to do startsWith and endsWith. Here are the functions (let's write the function names in snake case so as not to confuse ourselves with the built-in ones):

In Python:

def starts_with(string, target):
    return string[:len(target)] == target
def ends_with(string, target):
    return string[-len(target):] == target
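One edge case behaves differently from the built-in: an empty target. Since -0 is just 0, the slice string[-0:] is the whole string, so a slice-based check reports False for an empty target, while Python's str.endswith('') returns True. The sketch below shows one possible way to handle it; ends_with_naive and ends_with_checked are hypothetical names used only here:

```python
def ends_with_naive(string, target):
    # With target == '', this slices string[-0:], which is the whole string.
    return string[-len(target):] == target


def ends_with_checked(string, target):
    # Treat the empty target as a match, as the built-in does.
    if target == '':
        return True
    return string[-len(target):] == target


print(ends_with_naive('David', ''))     # False
print(ends_with_checked('David', ''))   # True
print('David'.endswith(''))             # True
```

The slice-based starts_with does not need this check, because string[:0] is already the empty string.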

In JavaScript:

function starts_with(string, target) {
  return string.slice(0, target.length) === target;
}
function ends_with(string, target) {
  return string.slice(-target.length) === target;
}

These do the job for typical inputs, but what about implementing the same logic another way? Maybe with another language? One that will help us think at a lower level.

My initial thought was that it would be something like this in C (spoiler: it was naive.):

#include <stdio.h>
#include <stdbool.h>
#include <string.h>

bool starts_with(char *string, char *target) {
  int target_length = strlen(target);
  for (int i = 0; i < target_length; i++) {
    if (string[i] != target[i]) {
      return false;
    }
  }
  return true;
}

bool ends_with(char *string, char *target) {
  int target_length = strlen(target);
  int starting_index = strlen(string) - target_length;
  for (int i = 0; i < target_length; i++) {
    if (string[starting_index + i] != target[i]) {
      return false;
    }
  }
  return true;
}

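However, as pointed out in the comments, this is problematic. It never checks whether target is longer than string (in ends_with, starting_index then becomes negative and the loop reads before the start of the string), it calls strlen() in advance when iterating up to the terminating '\0' would be enough, and it uses int and char * where size_t and char const * are more appropriate.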

Here are the simpler and correct versions of starts_with and ends_with:

bool starts_with(char const *string, char const *target) {
  for ( ; *target != '\0' && *target == *string; ++target, ++string );
  return *target == '\0';
}
bool ends_with(char const *string, char const *target) {
  char const *const t0 = target;
  for ( ; *target != '\0'; ++string, ++target ) {
    if ( *string == '\0' ) return false;
  }
  for ( ; *string != '\0'; ++string );
  size_t const t_len = (size_t)(target - t0);
  return strcmp( string - t_len, t0 ) == 0;
}

starts_with follows the same idea as before: we compare the original string and the target character by character until target ends. This also handles the case where target is longer than string: the loop stops at string's terminating '\0' while target still has characters left, so the function returns false.
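For readers more comfortable with Python, here is a rough transliteration of that loop using an index instead of pointers. It is only a sketch: starts_with_loop is a hypothetical name, and because Python strings have no '\0' terminator, the bounds are checked with len() instead.

```python
def starts_with_loop(string, target):
    i = 0
    # Advance while both strings still have a character at i and they match.
    while i < len(target) and i < len(string) and string[i] == target[i]:
        i += 1
    # The whole target was matched only if the loop consumed all of it.
    return i == len(target)


print(starts_with_loop('David', 'D'))     # True
print(starts_with_loop('Da', 'Dav'))      # False
```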

In ends_with, we first walk both strings together to check whether target is longer than string (in that case, we immediately return false). This walk also gives us target's length (t_len). We then advance to the end of string and compare its last t_len characters with the target (t0).
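The two passes can also be sketched in Python with explicit indices. This is an illustration of the control flow rather than an efficient implementation: ends_with_loop, s, and t are names chosen for this example, and len() stands in for the '\0' checks of the C code.

```python
def ends_with_loop(string, target):
    s = 0
    t = 0
    # Pass 1: walk both strings together. If string runs out first,
    # target is longer and cannot be a suffix.
    while t < len(target):
        if s == len(string):
            return False
        s += 1
        t += 1
    # Pass 2: move s to the end of string.
    while s < len(string):
        s += 1
    t_len = t
    # Compare the last t_len characters of string with target.
    return string[s - t_len:s] == target


print(ends_with_loop('David', 'vid'))   # True
print(ends_with_loop('id', 'David'))    # False
```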

Here's the whole code:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

// Function prototypes
bool starts_with(char const *string, char const *target);
bool ends_with( char const *string, char const *target );

int main(void) {
  char const *str = "David";
  char const *target_end = "vid";
  char const *target_start = "D";

  // prints "true"
  printf("%s\n", starts_with(str, target_start) ? "true" : "false");

  // prints "true"
  printf("%s\n", ends_with(str, target_end) ? "true" : "false");
}

bool starts_with(char const *string, char const *target) {
  for ( ; *target != '\0' && *target == *string; ++target, ++string );
  return *target == '\0';
}

bool ends_with( char const *string, char const *target ) {
  char const *const t0 = target;
  for ( ; *target != '\0'; ++string, ++target ) {
    if ( *string == '\0' ) return false;
  }
  for ( ; *string != '\0'; ++string );
  size_t const t_len = (size_t)(target - t0);
  return strcmp( string - t_len, t0 ) == 0;
}

And now, time for some introspection.

Did we reinvent the wheel? Maybe.

Was it a problem that had already been solved? That's what it was.

But did we have some fun along the way? Well, that depends on you, but I certainly did.

Oldest comments (6)

Paul J. Lucas

You should be passing char const* around; you should be using size_t, not int. Calculating strlen() in advance is more work than necessary: just iterate until you encounter the \0 byte at the end of the string (which is what strlen() does). You're also not checking for the cases where target is longer than string which means your code will likely core dump.

Here's a simpler (and correct) implementation of starts_with():

bool starts_with(char const *string, char const *target) {
  for ( ; *target != '\0' && *target == *string; ++target, ++string )
    ;
  return *target == '\0';
}

I'll leave a simpler (and correct) implementation of ends_with() as an exercise for the reader.

Eda

Thank you for pointing these out. I only have a very basic understanding of C from a general introductory course, so I should've probably not even attempted writing about it in the article.

I guess ends_with() could also work like this:

bool ends_with(char const *string, char const *target) {
  size_t diff = strlen(string) - strlen(target);
  return strcmp(string + diff, target) == 0;
}

But, I'm not sure how I could avoid using strlen() here, and do it in a similar way to your starts_with().

I'll update the article later on, and also would like to apologize for being quick to write a C example while still being quite the beginner, but, lesson learned. Thank you.

Paul J. Lucas • Edited

size_t diff = strlen(string) - strlen(target);

Nope. size_t is an unsigned type. If target is longer than string, you'll end up with a very large positive number.

For ends_with(), the simplest solution does use strlen():

bool ends_with( char const *s, char const *t ) {
    size_t const s_len = strlen( s );
    size_t const t_len = strlen( t );
    return t_len <= s_len && strcmp( s + s_len - t_len, t ) == 0;
}

However, that's slightly inefficient. If t_len is much longer than s_len, then you've wasted time traversing to the end of t. This version is more efficient:

bool ends_with( char const *s, char const *t ) {
    char const *const t0 = t;
    for ( ; *t != '\0'; ++s, ++t ) {
        if ( *s == '\0' )
            return false;
    }
    for ( ; *s != '\0'; ++s )
        ;
    size_t const t_len = (size_t)(t - t0);
    return strcmp( s - t_len, t0 ) == 0;
}

You're determining t_len while checking to see if you've gone past the end of s: if you have, then s can't end with t and you can just stop immediately. After, you scan for the end of s. Once you find it, then finally check for a match.

Eda

The first example made perfect sense, but I struggle with for ( ; s[ t_len ] != '\0'; ++s ); line from the second one.

So, in the second version, with an example string being "hey" and target being "ey", t_len would be 2, if I understand it right. Since "ey" is not longer than "hey", we don't return false immediately. But, doesn't incrementing s until the length of target in the first for loop mean that s is now only at the last character "y". So, s[t_len] confused me.

Also, is s0 needed since it's not used here?

Sorry for asking noob questions, I'm surely missing a lot and confusing myself but trying to understand it, now even regret writing the example in the first place. Thank you for your time and patience.

Paul J. Lucas

You are correct: s0 is not needed. (It was left over from an earlier version.) You've also found a bug. (Even "simple" code like this can be tricky!) I've edited the code with a corrected version.

Eda

Thank you, again, for your help. I updated the article with the correct versions.