Send data to Kafka

1. Create a local file at $HOME/.confluent/python.config with configuration parameters to connect to your Kafka cluster. Start with the template below:

  • Template configuration file for Confluent Cloud

    # Required connection configs for Kafka producer, consumer, and admin
    bootstrap.servers={{ BROKER_ENDPOINT }}
    security.protocol=SASL_SSL
    sasl.mechanisms=PLAIN
    sasl.username={{ CLUSTER_API_KEY }}
    sasl.password={{ CLUSTER_API_SECRET }}
    
    # Best practice for higher availability in librdkafka clients prior to 1.7
    session.timeout.ms=45000

You can find the value for {{ BROKER_ENDPOINT }} in your Confluent Cloud cluster settings.

Fill in {{ CLUSTER_API_KEY }} and {{ CLUSTER_API_SECRET }} with the API key and secret you downloaded earlier. A sketch of how a script might load this file follows.
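
For a sense of how a script can consume this file, here is a minimal sketch that parses the key=value lines into a dict and builds a producer. It assumes the confluent-kafka Python client, and read_ccloud_config is a hypothetical helper name; the workshop scripts may differ.

    import os
    from confluent_kafka import Producer  # assumed client library

    def read_ccloud_config(config_file):
        """Parse key=value lines from python.config, skipping comments and blanks."""
        conf = {}
        with open(os.path.expanduser(config_file)) as fh:
            for line in fh:
                line = line.strip()
                if line and not line.startswith("#"):
                    key, value = line.split("=", 1)
                    conf[key.strip()] = value.strip()
        return conf

    producer = Producer(read_ccloud_config("~/.confluent/python.config"))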

2. Clone the community repo.

$ git clone git@github.com:rockset/community.git

3. Navigate to the Confluent-AWS-Rockset-Workshop directory:

$ cd community/workshops/Confluent-AWS-Rockset-Workshop

4. Set up your Python environment by following the instructions in the repo; the Confluent docs cover the same setup. A typical setup is sketched below.
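
As one rough example, a virtual environment plus the client library might look like this (the package names are assumptions based on what the scripts appear to use, not the repo's exact requirements):

$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip install confluent-kafka requests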

5. Once your environment is ready, paste the URL from Mockaroo into both Python scripts that you cloned, replacing url=YOUR LINK:

  • The link for the user_activity_v1 dataset goes in user_activity_to_kafka.py.

  • The link for the user_purchases_v1 dataset goes in user_purchases_to_kafka.py.

You can copy each dataset's link from Mockaroo. For a sense of what the scripts do with the link, see the sketch below.
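
As a rough illustration only: the following sketch fetches rows from a Mockaroo URL and produces each one to a topic. The requests dependency, the JSON-array response format, and the hard-coded topic name are assumptions; the workshop scripts may differ.

    import json
    import os
    import requests  # assumed dependency for fetching the Mockaroo endpoint
    from confluent_kafka import Producer

    # Load python.config into a dict (same parsing as the sketch in step 1).
    conf = {}
    with open(os.path.expanduser("~/.confluent/python.config")) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#"):
                key, value = line.split("=", 1)
                conf[key.strip()] = value.strip()

    url = "YOUR LINK"  # paste your Mockaroo URL here, as in the workshop scripts
    producer = Producer(conf)

    for record in requests.get(url).json():  # assumes Mockaroo returns an array of JSON rows
        producer.produce("user_purchases", value=json.dumps(record))
    producer.flush()  # block until every message is delivered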

6. Run the script:

$ python user_purchases_to_kafka.py -f ~/.confluent/python.config -t user_purchases

You can confirm that data is arriving in a topic by checking its stats in Confluent Cloud, or from code, as sketched below.
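
If you prefer verifying from code, a quick consumer sketch like this one can read back a few messages. The group id and the message count here are arbitrary choices for this check, not part of the workshop scripts.

    import os
    from confluent_kafka import Consumer

    # Load python.config into a dict (same parsing as the sketch in step 1).
    conf = {}
    with open(os.path.expanduser("~/.confluent/python.config")) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#"):
                key, value = line.split("=", 1)
                conf[key.strip()] = value.strip()

    conf["group.id"] = "workshop-verify"    # arbitrary group id for this check
    conf["auto.offset.reset"] = "earliest"  # read the topic from the beginning

    consumer = Consumer(conf)
    consumer.subscribe(["user_purchases"])
    for _ in range(5):
        msg = consumer.poll(10.0)  # wait up to 10 seconds per message
        if msg is not None and msg.error() is None:
            print(msg.value().decode("utf-8"))
    consumer.close()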

7. Do the same for the other script:

$ python user_activity_to_kafka.py -f ~/.confluent/python.config -t user_activity

8. That's it! Now you are ready for the next section.

NOTE: You can find us on the Rockset Community if you have questions or comments about the workshop.
