## DEV Community

Banji


# Python | Hamming Problem

██████╗░███╗░░██╗░█████╗░
██╔══██╗████╗░██║██╔══██╗
██║░░██║██╔██╗██║███████║
██║░░██║██║╚████║██╔══██║
██████╔╝██║░╚███║██║░░██║
╚═════╝░╚═╝░░╚══╝╚═╝░░╚═╝

Hey, everyone.
In this post I'm going to walk you through the (simple) Hamming problem and my solution to it.
If you're not a beginner, you may want to skip this tutorial, because it could be boring and useless for you!
But if you're a newbie, bear with me, because it was such a cool problem for me.

Problem:
Calculate the Hamming Distance between two DNA strands.

Your body is made up of cells that contain DNA. Those cells regularly wear out and need replacing, which they achieve by dividing into daughter cells. In fact, the average human body experiences about 10 quadrillion cell divisions in a lifetime!

When cells divide, their DNA replicates too. Sometimes during this process mistakes happen and single pieces of DNA get encoded with the incorrect information. If we compare two strands of DNA and count the differences between them we can see how many mistakes occurred. This is known as the "Hamming Distance".

We read DNA using the letters C, A, G and T. Two strands might look like this:

```
GAGCCTACTAACGGGAT
CATCGTAATGACGGCCT
^ ^ ^  ^ ^    ^^
```

They have 7 differences, and therefore the Hamming Distance is 7.
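
To double-check that count, here is a small sketch that lists the positions where the two example strands differ. The names `strand_1`, `strand_2`, and `mismatches` are only for illustration; they are not part of the exercise.

```python
# The two example strands from above.
strand_1 = "GAGCCTACTAACGGGAT"
strand_2 = "CATCGTAATGACGGCCT"

# Collect every index where the letters at the same position differ.
mismatches = [i for i, (a, b) in enumerate(zip(strand_1, strand_2)) if a != b]

print(mismatches)       # [0, 2, 4, 7, 9, 14, 15]
print(len(mismatches))  # 7
```

The indexes line up with the `^` markers in the picture above.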

The Hamming Distance is useful for lots of things in science, not just biology, so it's a nice phrase to be familiar with ❤

So first of all I defined a function and used an if statement to check whether the two strands have the same length:

```python
def distance(strand_a, strand_b):
    if len(strand_a) == len(strand_b):
        # Split each strand into a list of single letters.
        first_strand = [letter for letter in strand_a]
        second_strand = [letter for letter in strand_b]
    else:
        raise ValueError("The lengths of the sequences are not equal")
```

But I could write this piece of code more simply. You may ask how?
Like this:

```python
def distance(strand_a, strand_b):
    # Stop right away if the strands cannot be compared letter by letter.
    if len(strand_a) != len(strand_b):
        raise ValueError("Length of two sequences must be the same")
```
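
To see the check in action, here is a minimal sketch that calls the function with two strands of different lengths. The inputs `"GAT"` and `"GA"` are made up for the demonstration, and the function body only contains the length check written so far.

```python
def distance(strand_a, strand_b):
    # Only the length check so far; the counting part comes later.
    if len(strand_a) != len(strand_b):
        raise ValueError("Length of two sequences must be the same")


try:
    distance("GAT", "GA")  # lengths 3 and 2, so this raises
except ValueError as error:
    print("Caught:", error)
```

With strands of equal length nothing is raised, and at this stage the function simply returns `None`.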

As you can see, instead of the 6-7 lines of the first solution, the second function needs just 3 lines of code! Strings are already iterable, so there is no need to copy them into lists first.

So let's see what we can do for the next part of the code...
We need to pair up the letters that sit at the same position in both strands, and the zip() function does exactly that!
Like this:

```python
# Pair up the letters of both strands, position by position.
diff = zip(strand_a, strand_b)
```
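
If you have never used zip() before, this tiny sketch shows what it produces. The two short strands are made up for the example, and list() is only there so the pairs can be printed.

```python
# zip() is lazy, so wrap it in list() to look at the pairs it creates.
pairs = list(zip("GAG", "CAT"))

print(pairs)  # [('G', 'C'), ('A', 'A'), ('G', 'T')]
```

Each tuple holds one letter from the first strand and the letter at the same position in the second strand.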

After that I created an empty list with two purposes:

• put the differences in a list
• use the len() function to count the differences

```python
count = []
```

With a for loop we go through the tuples to see whether the two letters in each pair are the same. If they differ, we append the letter to the empty list count = [], and at the end we return len(count), which is the number of differences!

Like this:

```python
    # Still inside distance(): compare each pair of letters.
    for x, y in diff:
        if x != y:
            count.append(x)
    return len(count)
```

So the complete solution looks like this:

```python
def distance(strand_a, strand_b):
    if len(strand_a) != len(strand_b):
        raise ValueError("Length of two sequences must be the same")

    count = []
    zip_a_b = zip(strand_a, strand_b)

    # Keep one entry per mismatched pair, then count the entries.
    for x, y in zip_a_b:
        if x != y:
            count.append(x)
    return len(count)
```
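
Here is a quick way to try the complete solution, using the example strands from the problem description plus two simple cases I made up (identical strands and empty strands).

```python
def distance(strand_a, strand_b):
    if len(strand_a) != len(strand_b):
        raise ValueError("Length of two sequences must be the same")

    count = []
    for x, y in zip(strand_a, strand_b):
        if x != y:
            count.append(x)
    return len(count)


print(distance("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT"))  # 7
print(distance("GGACTGA", "GGACTGA"))                      # 0
print(distance("", ""))                                    # 0
```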

EDIT:
My friend Jeremy Grifski suggested a more efficient way with less code.

"It feels weird to create and throw away a list just for its length," Jeremy Grifski said!
He then commented with a clever solution to improve our code. Instead of this part:

```python
    count = []
    zip_a_b = zip(strand_a, strand_b)

    for x, y in zip_a_b:
        if x != y:
            count.append(x)
    return len(count)
```

we can use a generator expression:

```python
    count = sum(1 for x, y in zip(strand_a, strand_b) if x != y)
    return count
```
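
Put together with the length check, the whole function could look like the sketch below. This is just my way of combining the two pieces; the example strands are the ones from the problem description.

```python
def distance(strand_a, strand_b):
    if len(strand_a) != len(strand_b):
        raise ValueError("Length of two sequences must be the same")

    # Produce a 1 for every mismatched pair and add them up.
    # No intermediate list is built.
    return sum(1 for x, y in zip(strand_a, strand_b) if x != y)


print(distance("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT"))  # 7
```

The result is the same as with the list version, only without the throwaway list.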

If you want to know more about list comprehensions or generator expressions, I found this link useful for understanding these two concepts.

Thank you for reading and spending your time with me.

Keep Moving Forward ツ

Code with 💛

🅑🅐🅝🅙🅘

Sean • Edited

I'm no noob, but here it is (fewer lines and more Pythonic):

```python
def distance(strand_a, strand_b):
    if len(strand_a) != len(strand_b): raise ValueError("Length of two sequences must be the same")
    zip_a_b = zip(strand_a, strand_b)
    # Build a list of the mismatched letters and count them.
    count = len([x for x, y in zip_a_b if x != y])
    return count
```

Yours was okay though.

Jeremy Grifski

Both are great! But, it feels weird to create and throw away a list just for its length. I thought this generator expression solution was clever:

```python
count = sum(1 for x, y in zip(strand_a, strand_b) if x != y)
```

Source: Stack Overflow

Sean

Way smarter, totally tosses the array. Absolutely, and perfectly, Pythonic.

Banji

Hey Jeremy, thank you for sharing a better and cleverer solution in this post.
Thank you for reading this post and leaving a comment to improve this solution.
Happy Coding with LoVe

Banji

I'm gonna edit my post and add your great solution in my post.
Thank you :)
Keep Moving Forward

Code with 💛

🅑🅐🅝🅙🅘

Banji

Hey Sean, tnx for sharing your better way to improve this solution.
Happy coding with LoVe

amir

made me feel good : ))))

Banji

You're making me feel amazing on this journey.
Keep moving forward bro
Tnx for encouraging me.
Happy coding with ❤️