Anand Das for Bito

Posted on Jul 13, 2023 • Originally published at bito.ai

Is Anthropic's Claude 2 model ready to take down GPT-4? We put them to the test

#ai #chatgpt #gpt3 #python

Anthropic released Claude 2, a new iteration of its AI model, to take on ChatGPT and Google Bard head-to-head.

In this article, I’ve compared Claude 2 with Claude 1.3 and GPT-4 to find out which one has better coding capabilities.

If you are curious to know the result, this article is for you!

To see how good each model is, we are going to give each model three prompts and rate them on the following factors:

Quality of the Code
Completeness of the Solution
Quality of the Explanation

Sneak peek into our findings:

AI Models	Comparison Result (Editor's Views)
Claude 1.3	Getting obsolete. Does not provide efficient and thorough suggestions as compared to Claude 2 and GPT-4.
Claude 2	Claude 2 is giving GPT-4 a run for its money, and it might just pull off an incredible victory soon!
GPT-4	For now, performs better than both Claude 1.3 and Claude 2 in all three tests.

But before we dive in, let me briefly introduce each model.

Claude 1.3

Claude 1.3, developed by Anthropic, is an AI language model known for its safety-focused approach. With significant funding support, Claude 1.3 showcases advanced capabilities in coding and natural language processing, reflecting Anthropic’s commitment to responsible AI development.

Claude 2

Anthropic’s Claude 2 is a powerful AI language model that surpasses its predecessor. With a context window of up to 100,000 tokens, Claude 2 can take much longer inputs into account when responding. It scores high in law, mathematics, and coding, demonstrating its versatility.

GPT-4

GPT-4, the latest in the GPT series by OpenAI, represents a significant leap in AI language models. Advancements in logical reasoning, scientific knowledge, and text generation make GPT-4 a transformative force in natural language processing, paving the way for enhanced chatbots, virtual assistants, and content generation capabilities. Its release generates anticipation and sets the stage for a new era in language modeling technology.

Now, let’s get started.

Coding Questions

1st Prompt:

Design and implement an efficient algorithm for finding the longest common subsequence between two strings.

Claude 1.3

Here is an implementation of a dynamic programming algorithm to find the longest common subsequence between two strings in Python:

def lcs(string1, string2):
  # Create a matrix to store lengths of longest common subsequences
  m = len(string1)
  n = len(string2)
  L = [[0 for x in range(n+1)] for y in range(m+1)]

  # Build the matrix in bottom-up manner
  for i in range(m+1):
    for j in range(n+1):
      if i == 0 or j == 0:
        L[i][j] = 0
      elif string1[i-1] == string2[j-1]:
        L[i][j] = L[i-1][j-1] + 1
      else:
        L[i][j] = max(L[i-1][j], L[i][j-1])

  # Backtrack to find the actual subsequence
  index = L[m][n]

  lcs = [""] * (index+1)
  lcs[index] = ""

  i = m
  j = n
  while i > 0 and j > 0:
    if string1[i-1] == string2[j-1]:
      lcs[index-1] = string1[i-1]
      i -= 1
      j -= 1
      index -= 1
    elif L[i-1][j] > L[i][j-1]:
      i -= 1
    else:
      j -= 1

  return "".join(lcs)

The key steps are:

Create a matrix to store lengths of longest common subsequences.
Build the matrix in bottom-up manner by considering all possible cases.
Backtrack based on the matrix to find the actual LCS.

The time complexity is O(mn) where m and n are lengths of the input strings as we iterate over the entire matrix. The space complexity is O(mn) to store the matrix.

Let me know if you would like me to explain or walk through any part of the implementation in more detail!

My Review:

Overall, the code is a good implementation of the dynamic programming algorithm for finding the longest common subsequence.

Here are some specific things I liked about the code:

The comments are clear and helpful.
The structure of the code is logical and easy to follow.
The code is well-formatted.

Here are some specific things that could be improved:

The code could be made more concise by using a more compact syntax.
The code could be tested more thoroughly.
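One way to test an LCS function more thoroughly is to check two properties on several inputs: the result must be a subsequence of both strings, and its length must match a brute-force search. The sketch below is illustrative only. The names is_subsequence, brute_force_lcs_length, and check are my own, and the lcs function is a compact stand-in that uses the same dynamic programming approach as the code above.

```python
from itertools import combinations


def lcs(a, b):
    """Stand-in DP implementation (same approach as the reviewed code)."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Backtrack from the bottom-right corner to recover one LCS
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] > table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def is_subsequence(sub, text):
    """True if the characters of sub appear in text in the same order."""
    it = iter(text)
    return all(ch in it for ch in sub)


def brute_force_lcs_length(a, b):
    """Exponential-time reference: only suitable for short test strings."""
    for size in range(len(a), 0, -1):
        for idx in combinations(range(len(a)), size):
            if is_subsequence("".join(a[k] for k in idx), b):
                return size
    return 0


def check(a, b):
    """Raise AssertionError if lcs(a, b) is not a valid longest subsequence."""
    result = lcs(a, b)
    assert is_subsequence(result, a) and is_subsequence(result, b)
    assert len(result) == brute_force_lcs_length(a, b)
    return result
```

Calling check on edge cases such as empty strings, identical strings, and strings with no characters in common covers the inputs most likely to expose indexing mistakes.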

Claude 2

Here is an efficient algorithm for finding the longest common subsequence (LCS) between two strings:

def lcs(X, Y):
    m = len(X)
    n = len(Y)

    # construct a matrix to store lengths of LCS
    L = [[0 for j in range(n+1)] for i in range(m+1)]

    # fill the matrix in bottom up manner
    for i in range(m+1):
        for j in range(n+1):
            if i == 0 or j == 0:
                L[i][j] = 0
            elif X[i-1] == Y[j-1]:
                L[i][j] = L[i-1][j-1] + 1
            else:
                L[i][j] = max(L[i-1][j], L[i][j-1])

    # L[m][n] contains the length of LCS
    return L[m][n]

This implements the classic dynamic programming approach to finding the LCS. The key steps are:

Initialize a matrix L of dimensions (m+1) x (n+1) to store lengths of LCS.
Fill the matrix in bottom-up manner by comparing characters of X and Y.
If characters match, take diagonal value and add 1.
If characters don’t match, take maximum of values on top and left.
L[m][n] contains the length of LCS.

The algorithm runs in O(mn) time and O(mn) space. By storing only the current and previous rows of L, space can be optimized to O(n).

Some key advantages of this algorithm are:

Optimal substructure property is used to build up solution iteratively
Much more efficient than the naive recursive approach
Easily extended to print out the actual LCS sequence

Let me know if you need any clarification or have additional questions!

My Review:

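Claude 2 provided shorter code, but it returns only the length of the LCS, not the subsequence itself. It also stores the entire matrix: the O(n) optimization of keeping only the current and previous rows is described in the explanation but not implemented. The code is easy to understand, but the answer as a whole is not as easy to follow as the Claude 1.3 answer.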
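The two-row optimization that Claude 2 describes in its explanation could look like the sketch below. This is my own illustration, not part of any model's answer, and the name lcs_length is an assumption. It computes only the length, because once earlier rows are discarded the subsequence can no longer be recovered by simple backtracking.

```python
def lcs_length(x, y):
    """Length of the LCS of x and y, keeping two rows of the DP table."""
    # Put the shorter string along the row so each row has min(m, n) + 1 cells
    if len(y) > len(x):
        x, y = y, x
    prev = [0] * (len(y) + 1)
    for ch in x:
        curr = [0]  # column 0: LCS with an empty prefix of y is 0
        for j, other in enumerate(y, start=1):
            if ch == other:
                curr.append(prev[j - 1] + 1)  # diagonal + 1
            else:
                curr.append(max(prev[j], curr[j - 1]))  # top or left
        prev = curr
    return prev[-1]


print(lcs_length("AGGTAB", "GXTXAYB"))  # 4
```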

GPT-4

To design an efficient algorithm for finding the longest common subsequence (LCS) between two strings, we can use dynamic programming. The algorithm will have a time complexity of O(mn), where m and n are the lengths of the input strings.

Here’s the implementation of the algorithm in Python:

def longest_common_subsequence(str1, str2):
    m = len(str1)
    n = len(str2)

    # Create a matrix to store the lengths of LCS
    lcs_matrix = [[0] * (n + 1) for _ in range(m + 1)]

    # Compute the lengths of LCS for all subproblems
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if str1[i - 1] == str2[j - 1]:
                lcs_matrix[i][j] = lcs_matrix[i - 1][j - 1] + 1
            else:
                lcs_matrix[i][j] = max(lcs_matrix[i - 1][j], lcs_matrix[i][j - 1])

    # Retrieve the LCS by backtracking through the matrix
    lcs = ""
    i, j = m, n
    while i > 0 and j > 0:
        if str1[i - 1] == str2[j - 1]:
            lcs = str1[i - 1] + lcs
            i -= 1
            j -= 1
        elif lcs_matrix[i - 1][j] > lcs_matrix[i][j - 1]:
            i -= 1
        else:
            j -= 1

    return lcs

Now, let’s test the algorithm with an example:

str1 = "AGGTAB"
str2 = "GXTXAYB"
lcs = longest_common_subsequence(str1, str2)
print(lcs)  # Output: GTAB

The algorithm will output the longest common subsequence between the two input strings, which in this case is “GTAB”.

My Review:

Better than both Claude 1.3 and Claude 2. It is efficient and easy to understand, the code is well-commented and well-formatted, and it returns the actual LCS string together with a test example. Like the Claude 1.3 solution, it uses O(mn) space for the full matrix.

2nd Prompt:

Implement a cryptographic algorithm such as RSA or AES for secure data encryption and decryption.

Claude 1.3

Here is an implementation of the RSA public-key cryptosystem for encryption and decryption in Python:

import random
import math

# Generate large random primes p and q
def generate_primes():
  p = get_random_prime(100)
  q = get_random_prime(100)
  return p, q

# Get a random n-bit prime number
def get_random_prime(n):
  while True:
    num = random.randrange(2**(n-1), 2**n)
    if isPrime(num):
      return num

# Test if a number is prime
def isPrime(num):
  if num < 2:
    return False
  for i in range(2, int(math.sqrt(num))+1):
    if num % i == 0:
      return False
  return True

# Compute gcd to ensure p and q are coprime
def gcd(p, q):
  while q != 0:
    p, q = q, p % q
  return p

# Generate encryption and decryption keys
def generate_keys(p, q):
  n = p * q
  phi = (p-1) * (q-1)

  # Choose encryption key between 1 and phi
  e = random.randrange(1, phi)
  g = gcd(e, phi)
  while g != 1:
    e = random.randrange(1, phi)
    g = gcd(e, phi)

  # Compute decryption key
  d = pow(e, -1, phi)

  return ((n, e), (n, d))

# Encrypt message m
def encrypt(pk, m):
  n, e = pk
  c = pow(m, e, n)
  return c

# Decrypt cipher c
def decrypt(sk, c):
  n, d = sk
  m = pow(c, d, n)
  return m

# Driver code
p, q = generate_primes()
public, private = generate_keys(p, q)
message = 123456
encrypted = encrypt(public, message)
decrypted = decrypt(private, encrypted)

print("Original Message:", message)
print("Encrypted:", encrypted)
print("Decrypted:", decrypted)

The key steps are:

Generate large random primes p and q
Compute n = pq and phi = (p-1)(q-1)
Choose encryption key e coprime to phi
Compute decryption key d such that ed ≡ 1 (mod phi)
Encrypt message m as c = m^e (mod n)
Decrypt cipher c as m = c^d (mod n)

This implements the core RSA algorithm for public key encryption. We can further improve security by using larger key sizes, randomized encryption padding, key splitting, etc.

My Review:

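The code is well-commented and well-formatted, and it is easy to understand the key steps of the algorithm. The solution could be improved with the techniques Claude 1.3 itself lists: larger key sizes, randomized encryption padding, and key splitting. One practical problem is that isPrime uses trial division up to the square root of the candidate, which is far too slow for 100-bit numbers, so key generation would not finish in a reasonable time.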
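Trial division up to the square root takes on the order of 2^50 steps for a 100-bit candidate, so a probabilistic primality test is the usual substitute. The sketch below shows the Miller-Rabin test as one option. It is my own illustration rather than anything the models produced, and the name is_probable_prime and the default of 40 rounds are assumptions. For real key generation the candidates themselves should come from a cryptographically secure source such as the secrets module, not from random.

```python
import random


def is_probable_prime(n, rounds=40):
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    # Quick trial division by a few small primes
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random base for this round
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True
```

Each round that passes lowers the chance of a composite slipping through by at least a factor of four, so a few dozen rounds are generally considered ample for sizes like these.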

Claude 2

Here is an example implementation of the AES encryption algorithm in Python:

import base64
from Crypto.Cipher import AES

key = b' sixteen byte key'

def encrypt(raw):
  raw = pad(raw)
  cipher = AES.new(key, AES.MODE_ECB)
  enc = cipher.encrypt(raw)
  return base64.b64encode(enc).decode('utf-8')

def decrypt(enc):
  enc = base64.b64decode(enc)
  cipher = AES.new(key, AES.MODE_ECB)
  dec = cipher.decrypt(enc)
  return unpad(dec).decode('utf-8')

def pad(s):
  return s + (16 - len(s) % 16) * chr(16 - len(s) % 16)

def unpad(s):
  return s[:-ord(s[len(s)-1:])]

plaintext = 'secret message'
ciphertext = encrypt(plaintext)
decrypted = decrypt(ciphertext)

print(f'Plaintext: {plaintext}')
print(f'Ciphertext: {ciphertext}')
print(f'Decrypted: {decrypted}')

This implements AES in ECB mode, padding the plaintext to 16 byte blocks. A 16 byte random key is specified. The ciphertext is base64 encoded before output.

For RSA, we would need to generate a public/private key pair, encrypt using the public key, decrypt using the private key. Here is an example RSA implementation:

from Cryptodome.PublicKey import RSA

key = RSA.generate(2048)
private_key = key.export_key()
public_key = key.publickey().export_key()

plaintext = 'secret message'
ciphertext = public_key.encrypt(plaintext.encode('utf-8'), 32)

decrypted = private_key.decrypt(ciphertext)
print(decrypted.decode('utf-8'))

This generates a new 2048 bit RSA key, then encrypts the plaintext with the public key and decrypts with the private key.

These are just simple examples but illustrate encrypting/decrypting data with AES and RSA in Python. There are many options and best practices around key generation, modes of operation, padding schemes etc that would be needed for a production system.

My Review:

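Claude 2 provided both an AES (symmetric) and an RSA (asymmetric) example in Python, each built on library calls. AES is much faster than RSA, and the two solve different problems, so neither is simply more secure than the other. The weak point in this answer is ECB mode, which encrypts identical plaintext blocks to identical ciphertext blocks and therefore leaks patterns in the data.

The code is short and easy to follow, but it does not run as written. The AES key literal as shown is 17 bytes rather than 16, pad() builds a str while the cipher expects bytes, and the RSA example calls encrypt() and decrypt() on exported key bytes instead of on a cipher object.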
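The pad and unpad helpers in Claude 2's AES example operate on str values, while Python 3 cipher APIs generally work on bytes. A bytes-based version of the same padding scheme (PKCS#7) could look like the sketch below. It is a minimal illustration under that assumption, and the BLOCK_SIZE constant and the validation in unpad are my own additions.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad(data: bytes) -> bytes:
    """Append n bytes of value n so the length is a multiple of BLOCK_SIZE."""
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([n]) * n


def unpad(data: bytes) -> bytes:
    """Strip PKCS#7 padding, rejecting malformed input."""
    if not data or len(data) % BLOCK_SIZE != 0:
        raise ValueError("invalid padded length")
    n = data[-1]
    if n < 1 or n > BLOCK_SIZE or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return data[:-n]


padded = pad(b"secret message")  # 14 bytes in, 16 bytes out
print(len(padded), unpad(padded))
```

Input that is already a multiple of the block size still gets a full extra block of padding, which is what lets unpad always find a valid length byte at the end.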

GPT-4

Sure! I can provide you with a high-level overview and code examples for both RSA and AES encryption and decryption algorithms. Let’s start with RSA.

RSA (Rivest-Shamir-Adleman) is an asymmetric encryption algorithm widely used for secure data transmission and digital signatures. It involves the use of a public key for encryption and a private key for decryption.

Here’s an example of how you can generate RSA keys, encrypt and decrypt data using the cryptography library in Python:

from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import serialization, hashes

# Generate RSA key pair
private_key = rsa.generate_private_key(
    public_exponent=65537,
    key_size=2048
)
public_key = private_key.public_key()

# Serialize and save private key
private_pem = private_key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.NoEncryption()
)
with open('private_key.pem', 'wb') as f:
    f.write(private_pem)

# Serialize and save public key
public_pem = public_key.public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo
)
with open('public_key.pem', 'wb') as f:
    f.write(public_pem)

# Encrypt data using the public key
data = b"Hello, World!"
ciphertext = public_key.encrypt(
    data,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)
print("Encrypted data:", ciphertext.hex())

# Decrypt data using the private key
plaintext = private_key.decrypt(
    ciphertext,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)
print("Decrypted data:", plaintext.decode())

Now let’s move on to AES (Advanced Encryption Standard), which is a symmetric encryption algorithm widely used for secure data storage and transmission.

AES operates on fixed-size blocks of data and uses a symmetric key for both encryption and decryption.

Here’s an example of AES encryption and decryption using the cryptography library in Python:

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend

# Generate a random AES key
key = b'Sixteen byte key'  # Change this to your own key

# Generate a random initialization vector (IV)
iv = b'Initialization V'  # Change this to your own IV

# Encrypt data using AES-CBC mode
plaintext = b"Hello, World!"
backend = default_backend()
cipher = Cipher(algorithms.AES(key), modes.CBC(iv), backend=backend)
encryptor = cipher.encryptor()
ciphertext = encryptor.update(plaintext) + encryptor.finalize()
print("Encrypted data:", ciphertext.hex())

# Decrypt data using AES-CBC mode
decryptor = cipher.decryptor()
decrypted_data = decryptor.update(ciphertext) + decryptor.finalize()
print("Decrypted data:", decrypted_data.decode())

Remember to handle key management, secure key exchange, and other security considerations based on your specific use case.

Please note that the examples provided here are for educational purposes and may not cover all aspects of secure implementation. For real-world scenarios, it’s recommended to use well-tested and trusted cryptographic libraries or consult a security professional.

Let me know if you need any further assistance!

My Review:

Better than both Claude 1.3 and Claude 2. It provided both RSA and AES encryption and decryption, with an explanation of each. It also uses the OAEP padding scheme in RSA, which protects against the attacks that unpadded RSA is open to.

Apart from that, it uses the CBC mode of operation in AES, which avoids the pattern leakage of ECB mode. The AES example still needs work before real use: the key and IV are hard-coded even though the comments call them random, and the 13-byte plaintext is not padded to the 16-byte block size that CBC requires.
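The comments in GPT-4's AES example say "Generate a random AES key" and "Generate a random initialization vector", but both values are fixed literals. Generating them with Python's standard secrets module could look like the sketch below. The helper names new_key and new_iv are hypothetical, chosen only for this illustration.

```python
import secrets

AES_BLOCK_BYTES = 16  # AES block size, which is also the CBC IV length


def new_key(bits: int = 128) -> bytes:
    """Return a fresh random AES key of the requested size."""
    if bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192, or 256 bits")
    return secrets.token_bytes(bits // 8)


def new_iv() -> bytes:
    """Return a fresh random IV; use a new one for every message."""
    return secrets.token_bytes(AES_BLOCK_BYTES)


key = new_key()
iv = new_iv()
print(len(key), len(iv))  # 16 16
```

The IV does not need to be secret, so it is commonly stored or sent alongside the ciphertext, while the key must stay private.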

3rd Prompt:

Create a function that generates all possible combinations of a given string. For example, given the string “abc”, the function should return an array with all possible combinations: [“a”, “b”, “c”, “ab”, “ac”, “bc”, “abc”].

Claude 1.3

Here is a Python function to generate all possible combinations of a given string:

def generate_combinations(string):
  results = []
  n = len(string)

  # Generate all 2^n possible combinations
  for i in range(2**n):
     combo = []
     for j in range(n):
        if (i >> j) % 2 == 1:
          combo.append(string[j])
     results.append(''.join(combo))

  return results

The key steps are:

Calculate 2^n where n is the length of the input string. This gives us the total number of possible combinations.
Loop through numbers from 0 to 2^n – 1. Each number represents a unique binary string that indicates which characters to include in each combination.
Convert the number to binary. Check each bit from right to left. If the bit is 1, append that character to the current combination.
Join the characters and add each combination to the results list.

For example, for “abc”:

There are 2^3 = 8 possible combinations
Loop from 0 (000 in binary) to 7 (111 in binary)
000 = a, 001 = b, 010 = c, 011 = ab, etc.
Append each combo and return all 8 combinations

The time complexity is O(n * 2^n) since we loop from 0 to 2^n and create each combination.

Let me know if you need any clarification or have additional questions!

My Review:

Claude 1.3 implemented a Python function to generate all possible combinations of a given string. The function is well-commented and easy to understand. The key steps of the algorithm are well-explained.

The function uses a simple and efficient algorithm to generate all possible combinations. The algorithm works by looping through all 2^n possible combinations, where n is the length of the input string.

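The function also returns the empty string (for i = 0), which the example in the prompt does not include, and the worked example in the explanation maps the bit patterns to the wrong combinations: 000 is the empty string and 001 is “a”. Apart from that, the implementation is sufficient for most applications.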
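To make the bit-pattern idea concrete, the sketch below prints which combination each mask selects for “abc”. It is a small variation on Claude 1.3's approach, written for illustration: the name nonempty_combinations is my own, and the loop starts at 1 so that the empty combination is skipped, matching the example in the prompt.

```python
def nonempty_combinations(string):
    """All non-empty combinations, one per bitmask from 1 to 2**n - 1."""
    n = len(string)
    results = []
    for mask in range(1, 2 ** n):
        # Bit j of mask decides whether string[j] is included
        combo = [string[j] for j in range(n) if (mask >> j) & 1]
        results.append("".join(combo))
    return results


for mask, combo in enumerate(nonempty_combinations("abc"), start=1):
    print(f"{mask:03b} -> {combo}")
# 001 -> a
# 010 -> b
# 011 -> ab
# 100 -> c
# 101 -> ac
# 110 -> bc
# 111 -> abc
```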

Claude 2

Here is a function that generates all possible combinations of a given string:

from itertools import combinations

def get_combinations(string):
    combinations = []
    for i in range(1, len(string)+1):
        combinations += list(combinations(string, i))
    return combinations

To break this down:

Import the combinations function from the itertools module. This will allow us to efficiently generate all combinations.
Create an empty list to store the results.
Use a for loop to iterate from 1 to the length of the string + 1. This will generate all combinations from length 1 to length of the string.
For each iteration, get all combinations of the string of length i using combinations(string, i).
Append the results to the combinations list.
After the loop, return the combinations list containing all possible combinations.

To test it:

print(get_combinations("abc"))
# ['a', 'b', 'c', 'ab', 'ac', 'bc', 'abc']

So this generates all possible combinations by leveraging the power of itertools.combinations to efficiently get the subsets of the desired length in each iteration.

My Review:

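A cleaner approach than Claude 1.3, but the code has bugs. Claude 1.3 built each combination from a binary representation, while Claude 2 used the itertools.combinations function, which is more concise and leaves the enumeration to the standard library. The combinations function takes two arguments, the string and the length of the combinations, and returns an iterator of tuples of that length. As written, however, the function does not run: the local list named combinations shadows the imported function, so the call combinations(string, i) fails with a TypeError. Even with the name fixed, the result would be a list of tuples rather than the list of strings shown in the sample output.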
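A working version of the itertools approach could look like the sketch below. It is a minimal illustration rather than Claude 2's actual output: the result list is renamed to results (my own choice) so that it no longer shadows the imported combinations function, and each tuple is joined into a string.

```python
from itertools import combinations


def get_combinations(string):
    """All non-empty combinations of the characters, shortest first."""
    results = []
    for size in range(1, len(string) + 1):
        # combinations() yields tuples such as ('a', 'b'); join them into strings
        results.extend("".join(combo) for combo in combinations(string, size))
    return results


print(get_combinations("abc"))
# ['a', 'b', 'c', 'ab', 'ac', 'bc', 'abc']
```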

GPT-4

Here’s a Python implementation of a function that generates all possible combinations of a given string:

def generate_combinations(s):
    combinations = []

    # Helper function to generate combinations recursively
    def generate_helper(substring, index):
        if index == len(s):
            combinations.append(substring)
            return

        # Exclude the current character
        generate_helper(substring, index + 1)

        # Include the current character
        generate_helper(substring + s[index], index + 1)

    generate_helper("", 0)
    return combinations

# Example usage
input_string = "abc"
result = generate_combinations(input_string)
print(result)

Output:

['', 'c', 'b', 'bc', 'a', 'ac', 'ab', 'abc']

The function generate_combinations takes a string s as input and initializes an empty list called combinations to store the generated combinations. It also defines a helper function generate_helper that takes a substring and an index as parameters.

The generate_helper function uses recursion to generate combinations. At each step, it has two choices: include the current character at the index or exclude it. It calls itself twice, once by excluding the current character and once by including it. The base case is reached when the index equals the length of the string, and at this point, it adds the generated substring to the combinations list.

The generate_combinations function calls the helper function with an empty initial substring and index 0. Finally, it returns the combinations list containing all possible combinations of the input string.

My Review:

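Good job! GPT-4 used recursion to generate all possible combinations of a given string. At each character it makes one call that excludes the character and one that includes it, which keeps the logic short and leaves little room for indexing errors. Note that the output also contains the empty string, which the example in the prompt does not list.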

Rating Each Model

Quality of the Code

Model	Quality of Code	Rating
Claude 1.3	Adequate code, but it may lack some refinements and optimizations present in other models.	3 out of 5
Claude 2	Good quality code, which demonstrates solid engineering practices and is well-structured.	4 out of 5
GPT-4	Highly advanced code, better than the other models in terms of quality and optimization.	5 out of 5

Completeness of the Solution

Model	Completeness of the Solution	Rating
Claude 1.3	Less comprehensive but its solution was complete.	4 out of 5
Claude 2	Complete solutions, addressing various aspects of the problem at hand with a higher degree of completeness.	4.5 out of 5
GPT-4	Comprehensive solutions to a variety of user queries, with an excellent level of completeness.	5 out of 5

Quality of the Explanation

Model	Quality of the Explanation	Rating
Claude 1.3	Less robust and thorough explanations compared to the other models.	3.5 out of 5
Claude 2	More detailed and informative explanations, but still not as good as GPT-4.	4 out of 5
GPT-4	Highly informative and well-explained responses, thanks to advancements in natural language processing.	5 out of 5

Conclusion

Model	Quality of the Code (1-5)	Completeness of the Solution (1-5)	Quality of the Explanation (1-5)	Total (out of 15)
Claude 1.3	3 Limited	4 Good	3.5 Good	10.5 Average
Claude 2	4 Good	4.5 Good	4 Good	12.5 Good
GPT-4	5 Excellent	5 Excellent	5 Excellent	15 Overall Winner

Based on the above ratings, GPT-4 is the clear winner in terms of coding capabilities. It has the highest scores for all three categories: quality of code, completeness of the solution, and quality of the explanation.

Claude 2 is a close second, with strong scores in all three categories. Claude 1.3 is a good choice for those who need a model with extensive documentation, but it is not as efficient or as easy to understand as Claude 2 or GPT-4.

Overall, the three models are all impressive in their own way. They showcase the power of large language models and their potential to revolutionize the way we interact with computers.
