Prabhu R

Originally published at blog.teamnexus.in

Datafaker: Simplifying Test Data Generation for Java and Kotlin

Effective testing is crucial to the reliability and functionality of an application, and robust tests depend on representative, reliable test data. Datafaker, a library for Java and Kotlin, simplifies generating that data. In this article, we'll explore Datafaker through Java code examples (the Kotlin usage is nearly identical), starting with the Maven coordinates you need.

What Is Datafaker?

Datafaker is a Java and Kotlin library designed to streamline test data generation. Its simple interface makes creating mock data straightforward, whether you need to populate a database, exercise API endpoints, or feed other tests. It can be used to generate fake data for a variety of purposes, such as:

  • Testing software
  • Creating training data for machine learning models
  • Anonymizing data
  • Generating mock data for presentations

Key Features of Datafaker

Datafaker boasts a range of features that make it an indispensable tool for developers and testers:

  1. Variety of Data Types: Datafaker supports a wide array of data types, including names, addresses, phone numbers, emails, dates, numbers, and more. This versatility ensures you can generate diverse test data for different use cases.

  2. Fake Data Providers: Datafaker ships with many providers (233 as of 2.0.0), organized into the following groups:

  • Base (Providers of everyday data)
  • Entertainment (Providers for movies, shows, books)
  • Food (Providers for different types of food)
  • Sport (Providers for different types of sport)
  • Videogame (Video game providers)

  3. Customization: You can customize data generation by setting specific constraints or formats, or by implementing your own Data Provider. For instance, you can define date formats, create data within a specific range, or adhere to specific patterns.

  4. Multiple locales: Datafaker supports multiple locales and makes it easy to mix data from them. The easiest way to do so is to create one Faker per locale and mix between those fakers.

  5. Repeatable random results: To generate predictable, repeatable data, provide a seed. Faker objects created with the same seed always produce the same values, which is handy when results need to be reproduced across runs.

  6. Bulk Data Generation: Datafaker supports bulk data generation, making it easy to create extensive datasets for comprehensive testing. The generated data can be returned as a Java Collection or a Java Stream, whichever the test needs.

  7. Export/Transform Data: The generated data can be exported or transformed into multiple formats, such as XML, JSON, CSV, and SQL, for compatibility with various testing and development environments.
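
A few of the features above (patterns and ranges, locales, and seeding) can be sketched together as follows. This is a minimal illustration rather than a definitive recipe: the class FeatureSketch and its method names are made up for this example, and the Datafaker calls it relies on (bothify, number().numberBetween, and the Faker constructors taking a Locale or a Random) are assumed to be available in the 2.0.x line used in this article.

```java
import java.util.Locale;
import java.util.Random;

import net.datafaker.Faker;

// Sketch only: FeatureSketch and its method names are illustrative, not part of Datafaker.
class FeatureSketch {

    // Repeatable results: a Faker built on a seeded Random replays the same sequence.
    static String seededFullName(long seed) {
        Faker faker = new Faker(new Random(seed));
        return faker.name().fullName();
    }

    // Multiple locales: create one Faker per locale and use whichever a test needs.
    static String cityFor(Locale locale) {
        Faker faker = new Faker(locale);
        return faker.address().city();
    }

    // Patterns: in bothify, '#' is replaced by a digit and '?' by a letter.
    static String orderCode(Faker faker) {
        return faker.bothify("ORD-??-####");
    }

    // Ranges: the lower bound is inclusive, the upper bound exclusive.
    static int ageBetween(Faker faker, int min, int max) {
        return faker.number().numberBetween(min, max);
    }

    public static void main(String[] args) {
        Faker faker = new Faker();

        // Two fakers with the same seed print the same name.
        System.out.println(seededFullName(42L));
        System.out.println(seededFullName(42L));

        // Mix locales by choosing the matching Faker per value.
        System.out.println(cityFor(Locale.GERMAN));
        System.out.println(cityFor(Locale.FRENCH));

        System.out.println(orderCode(faker));
        System.out.println(ageBetween(faker, 18, 65));
    }
}
```

Keeping one Faker per locale, as the list above suggests, makes it explicit in a test which locale each value came from.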

Similar projects such as Java Faker, Kotlin Faker, and JFairy provide comparable functionality; however, Datafaker is quite actively maintained.

Now, let's dive into code examples to illustrate how Datafaker can be used for test data generation.

Code Examples

To get started with Datafaker in Java, add the following Maven dependency:

<dependency>
    <groupId>net.datafaker</groupId>
    <artifactId>datafaker</artifactId>
    <version>2.0.2</version>
</dependency>

Now, let's look at how you can generate random names, email addresses and phone numbers in Java:

// Java Example
import java.util.List;

import net.datafaker.Faker;

public class TestDataGeneration {
    public static void main(String[] args) {
        Faker faker = new Faker();

        // Generate random names, an email address and a phone number
        String firstName = faker.name().firstName();
        String lastName = faker.name().lastName();
        String fullName = faker.name().fullName();
        String email = faker.internet().emailAddress();
        String phone = faker.phoneNumber().phoneNumber();

        // Generate a list of 10 names; each element is produced by one of the two suppliers
        List<String> names = faker.collection(
                () -> faker.name().firstName(),
                () -> faker.name().lastName())
            .len(10)
            .generate();
    }
}

This Java snippet generates a first name, a last name, a full name, an email address, and a phone number. It then generates a collection of 10 names from two Suppliers, one providing first names and the other last names; each element of the collection comes from one of the two.

The bulk generation can also be returned as a Stream, as in the following code snippet:

Stream<String> names = 
    faker.stream(
            () -> faker.name().firstName(), 
            () -> faker.name().lastName())
        .len(10)
        .generate();
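
As a further sketch of bulk generation, the helpers below wrap the collection and stream builders in small reusable methods. The class BulkSketch and its method names are hypothetical; the two-argument len(min, max) variant follows the form shown in the Datafaker documentation and is assumed to exist in the version used here.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import net.datafaker.Faker;

// Sketch only: BulkSketch and its method names are illustrative, not part of Datafaker.
class BulkSketch {

    // A list with exactly 'count' fake email addresses.
    static List<String> emails(Faker faker, int count) {
        return faker.collection(() -> faker.internet().emailAddress())
                .len(count)
                .generate();
    }

    // A list whose size is picked between 'min' and 'max' on each call.
    static List<String> cities(Faker faker, int min, int max) {
        return faker.collection(() -> faker.address().city())
                .len(min, max)
                .generate();
    }

    // The same idea as a Stream, collected into a list at the end.
    static List<String> fullNames(Faker faker, int count) {
        Stream<String> names = faker.stream(() -> faker.name().fullName())
                .len(count)
                .generate();
        return names.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Faker faker = new Faker();
        emails(faker, 5).forEach(System.out::println);
        cities(faker, 3, 5).forEach(System.out::println);
        fullNames(faker, 3).forEach(System.out::println);
    }
}
```

A Stream is the better fit when the data is consumed once and in order; a List is more convenient when a test needs to index into or reuse the values.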

Datafaker also provides a number of features for generating more complex data. For example, you can use Datafaker to generate fake data for:

  • Addresses
  • Companies
  • Credit cards
  • Dates and times
  • Locations
  • Products
  • Services
  • Vehicles

To generate more complex data, you can use Datafaker's providers. Providers are classes that generate fake data for a specific type of data. For example, the Address provider can generate fake addresses, while the Company provider can generate fake companies.

Here is an example of how to use Datafaker's Company provider to generate a fake company profile:

import net.datafaker.Faker;

public class Example {
    public static void main(String[] args) {
        Faker faker = new Faker();

        String name = faker.company().name();
        String catchPhrase = faker.company().catchPhrase();
        String website = faker.internet().domainName();

        System.out.println("Name: " + name);
        System.out.println("Catch phrase: " + catchPhrase);
        System.out.println("Website: " + website);
    }
}

This code will generate a fake company profile with a random name, catch phrase, and website.
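
Other providers follow the same pattern. As a minimal sketch, the Address provider can be used to assemble a postal label; AddressSketch and postalLabel are names invented for this example, and the provider methods used (streetAddress, city, zipCode, country) are assumed to be present in the Datafaker version above.

```java
import net.datafaker.Faker;

// Sketch only: AddressSketch and postalLabel are illustrative names, not Datafaker API.
class AddressSketch {

    // Builds a three-line postal label from the Address provider.
    static String postalLabel(Faker faker) {
        String street = faker.address().streetAddress();
        String city = faker.address().city();
        String zip = faker.address().zipCode();
        String country = faker.address().country();
        return street + "\n" + zip + " " + city + "\n" + country;
    }

    public static void main(String[] args) {
        System.out.println(postalLabel(new Faker()));
    }
}
```

Each call is independent, so the city, ZIP code, and country in one label are not guaranteed to belong together; that is usually fine for test data but worth knowing if a test validates addresses.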

In Kotlin, the calls are the same and only the syntax differs: for example, val faker = Faker() instead of Faker faker = new Faker().

Efficient testing requires reliable and representative test data, and Datafaker makes that data easy to produce. With its simple interface and wide array of data generation capabilities, it is a valuable tool for both developers and testers, whether you need names, addresses, user data, or any other type of test data. Give it a try and see how it streamlines the testing process and saves time and effort in the long run.

To get started with Datafaker, you can find it on Maven Central using the following Maven coordinates:

<dependency>
    <groupId>net.datafaker</groupId>
    <artifactId>datafaker</artifactId>
    <version>2.0.2</version>
</dependency>

The documentation on the official website is also comprehensive. Please read it to understand the wide range of options Datafaker provides.

Top comments (2)

 
Sergey Nuyanzin

the code from the first example above doesn't seem to be compilable

        ...
        String firstName = faker.name.firstName();
        String lastName = faker.name.lastName();
        String fullName = faker.name.fullName();
        ...

name is a method, thus parentheses are required
so it should be like

        ...
        String firstName = faker.name().firstName();
        String lastName = faker.name().lastName();
        String fullName = faker.name().fullName();
        ...

or even better (here no need for parentheses)

        ...
        var name =  faker.name();
        String firstName = name.firstName();
        String lastName = name.lastName();
        String fullName = name.fullName();
        ...
 
Prabhu R

Thanks for pointing out. Somehow got deleted when editing. Corrected it!