
Generate test data using Mockaroo and Postman for relational database tables

by: Feby George

Background

Mockaroo is a great tool for generating and downloading large amounts of realistic test data. It has many built-in data types, supports regular expressions for custom data requirements, and can be a good addition to any testing team's arsenal. A common use case is generating test data for multiple tables in a relational database. Most of the time we also need a different number of rows for each table (for example, 5 orders for each customer). In this post, let us look at using Mockaroo to accomplish this.

To mock relational tables in Mockaroo, you have to:

1. Generate the data in the parent table.
2. Download the data in CSV format.
3. Upload the downloaded CSV as a dataset.
4. Use the dataset as a reference in your child table.

Let us look at an example with customers and orders tables.

What if you must generate a fresh data set? You will have to repeat the above steps to maintain data integrity, which becomes tedious and error-prone if done manually. Fortunately, Mockaroo has APIs to generate data and upload a dataset.

1. Generate customers table data and download it.

curl "https://api.mockaroo.com/api/generate.csv?key=xxxxxxxx&count=10&schema=customers" > customers.csv

2. Upload the customers dataset.

curl -X POST "https://api.mockaroo.com/api/datasets/customers?key=xxxxxxxx" -H "Content-Type: text/csv" --data-binary @customers.csv

3. Generate orders table data.
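The third call is not shown above; a sketch, assuming a Mockaroo schema named "orders" that references the uploaded customers dataset (the key is a placeholder). Note that the URL must be quoted, otherwise the shell treats the unquoted `&` as a background operator:

```shell
# Sketch of step 3. "orders" as the schema name and the key value
# are assumptions; substitute your own schema name and API key.
API_KEY="xxxxxxxx"
URL="https://api.mockaroo.com/api/generate.csv?key=${API_KEY}&count=10&schema=orders"

# Quote the URL: an unquoted '&' would background the command.
curl --fail --silent "$URL" > orders.csv || echo "request failed (placeholder key)" >&2
```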

Problem

So far so good, but what if you need to generate test data not just for two related tables, but for a relational database with a large number of tables? For example, let us add a products table to the mix, so that the orders table has a reference to the products table as well.


Now, the sequence of events for test data generation will be:

1. Generate data in the customers table and download it.
2. Upload the customers dataset.
3. Generate data in the products table and download it.
4. Upload the products dataset.
5. Generate data in the orders table.

As you can see, this becomes unmanageable as we add more related tables. Also think about scenarios where you need to generate 1000 orders for 100 customers using 50 different products.
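The five-step sequence above can be sketched as a small shell script. This is a sketch, not the collection described below: the API key is a placeholder, and it assumes each Mockaroo schema and dataset shares its table's name:

```shell
#!/bin/sh
# Sketch: automate generate-then-upload per table.
# API_KEY is a placeholder; schema/dataset names are assumed to match table names.
API_KEY="xxxxxxxx"
BASE="https://api.mockaroo.com/api"

# Build the generate URL for a given table and row count.
gen_url() {
  printf '%s/generate.csv?key=%s&count=%s&schema=%s' "$BASE" "$API_KEY" "$2" "$1"
}

# Generate a table's data, then upload it as a dataset of the same name.
generate_and_upload() {
  table="$1"; count="$2"
  curl --fail --silent "$(gen_url "$table" "$count")" > "${table}.csv" || return 1
  curl --fail --silent -X POST "${BASE}/datasets/${table}?key=${API_KEY}" \
       -H "Content-Type: text/csv" --data-binary "@${table}.csv"
}

# Parent tables first, then the child table that references them.
generate_and_upload customers 10 || true
generate_and_upload products 10 || true
curl --fail --silent "$(gen_url orders 10)" > orders.csv || true
```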

Solution

We can build a simple API collection in Postman with just two requests that iteratively calls the Mockaroo APIs to generate and upload data. The first request generates the test data; the second uploads the result of the first call as a dataset. The process is then repeated for every table mentioned in the iteration data file.

1. Import a copy of the collection along with an environment file named ‘mockaroo’.

2. In the ‘mockaroo’ environment file, update the variable ‘api_key’ with your API key from Mockaroo. (You can sign up for a free account and then find your API key in your Mockaroo account settings.)

3. Create an iteration data CSV file specifying the tables and the number of records required per table. For example, to generate 1000 orders for 100 customers using 50 different products, use the file below.

table_name,number_of_records
customers,100
products,50
orders,1000

Make sure you list the tables in the proper order: data for parent tables should be generated first, followed by child tables. You can use a tool like SchemaSpy to inspect your database and determine the insertion order. Now run the collection with the environment file and the iteration data file selected. Voila! Your relational tables are ready with realistic test data and proper data integrity in place.
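If you prefer the command line to the Postman UI, the same run can be scripted with Postman's Newman runner, which accepts an environment file and an iteration data file. A sketch; the exported collection and environment file names are assumptions:

```shell
# Create the iteration data file shown above.
cat > iteration_data.csv <<'EOF'
table_name,number_of_records
customers,100
products,50
orders,1000
EOF

# Run the collection with Newman, if it is installed.
# The collection/environment file names are assumptions: export your
# own copies from Postman and substitute the paths.
if command -v newman >/dev/null 2>&1; then
  newman run mockaroo_collection.json \
    --environment mockaroo_environment.json \
    --iteration-data iteration_data.csv
fi
```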
