Create Many-To-One Relationships Between Columns in a Synthetic Table with PySpark UDFs

Source: Towards Data Science

I’ve recently been playing around with Databricks Labs Data Generator to create completely synthetic datasets from scratch. As part of this, I’ve looked at building sales data around different stores, employees, and customers. To make that data realistic, I wanted to create relationships between the columns I was artificially populating, such as mapping employees and customers to […]
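
To make the idea of a many-to-one column relationship concrete, here is a minimal sketch in PySpark. It is not the method from the original article: the column names `employee_id` and `store_id`, the helper names, and the fixed block size of five employees per store are all assumptions chosen for illustration. The mapping itself is a plain Python function, so it can be checked without a Spark session, and a second function wraps it in a UDF.

```python
from typing import Optional


def store_for_employee(employee_id: Optional[int],
                       employees_per_store: int = 5) -> Optional[int]:
    """Map an employee ID to a store ID (hypothetical rule).

    Integer division puts each consecutive block of IDs in the same
    store, so many employees map to one store.
    """
    if employee_id is None:
        # Spark passes nulls to Python UDFs as None; keep them null.
        return None
    return employee_id // employees_per_store


def add_store_column(df, employees_per_store: int = 5):
    """Add a `store_id` column derived from `employee_id` via a UDF.

    Assumes `df` is a Spark DataFrame with an integer `employee_id`
    column. PySpark is imported here so the mapping function above
    stays usable without Spark installed.
    """
    from pyspark.sql import functions as F
    from pyspark.sql.types import IntegerType

    to_store = F.udf(
        lambda e: store_for_employee(e, employees_per_store),
        IntegerType(),
    )
    return df.withColumn("store_id", to_store(F.col("employee_id")))
```

With a Spark session available, `add_store_column(spark.range(20).withColumnRenamed("id", "employee_id"))` would attach a store to each of twenty employee IDs. For a rule this simple, a built-in column expression would likely be faster than a UDF; the UDF form is shown only because it generalizes to arbitrary Python mapping logic.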