Google Pub/Sub

Introduction: The Power of Event-Driven Architectures

Have you ever wondered how large-scale systems like Google, Netflix, or Spotify manage to process millions of real-time events, such as sending notifications, updating dashboards, or recommending personalized content, all without missing a beat? The secret lies in event-driven architectures, and at the heart of such systems is a powerful tool: Google Cloud Pub/Sub.

In a world where businesses demand seamless integration and instant processing of data, traditional point-to-point messaging systems often fall short. Enter Google Pub/Sub, a fully managed messaging service designed to decouple applications, handle massive data streams, and enable real-time communication with unparalleled reliability and scalability.

Why Pub/Sub is Essential for Modern Systems
Modern applications require more than just static workflows; they thrive on dynamic, event-driven processes that can adapt and respond to real-time data. Here are some key reasons why Google Pub/Sub has become indispensable:

Real-Time Processing: From IoT devices generating millions of sensor readings to financial systems detecting fraudulent transactions, Pub/Sub ensures these events are captured and processed in real time.

Scalability: Pub/Sub seamlessly scales to handle millions of messages per second, supporting the ever-growing demands of modern businesses.

Decoupling of Services: It simplifies architectures by allowing independent components to communicate asynchronously, improving maintainability and flexibility.

Global Reach: Built on Google's global infrastructure, Pub/Sub offers low-latency message delivery across regions.

Integration-Friendly: It integrates seamlessly with other Google Cloud services like BigQuery, Dataflow, Cloud Functions, and Cloud Storage, making it a cornerstone for building robust, interconnected systems.

In short, Pub/Sub enables independent applications to exchange messages reliably and at scale, making it a natural backbone for event-driven systems.

In this tutorial, we'll delve into the core concepts of Pub/Sub, explore practical examples using the command-line interface (CLI) and Python SDK, and discuss various use cases.

Key Concepts

  • Topic: A named resource to which messages are published.
  • Subscription: A named resource that receives messages from a specific topic.
  • Publisher: An application that sends messages to a topic.
  • Subscriber: An application that receives messages from a subscription.
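
Every topic and subscription is identified by a fully qualified resource name, a format you will see in the CLI and SDK examples below:

```
projects/your-project-id/topics/my-topic
projects/your-project-id/subscriptions/my-subscription
```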

Getting Started

  • Set Up Your Google Cloud Project:
    • Create a new Google Cloud project or select an existing one.
    • Enable the Pub/Sub API (see the commands below).
  • Install the Google Cloud SDK:
    • Download and install the SDK from the official website.
    • Authenticate your Google Cloud account:

```bash
gcloud auth login
```
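
If you plan to follow along, you will also need the Pub/Sub API enabled and, for the Python examples, the client library installed. A minimal setup, assuming gcloud and pip are on your path:

```bash
# Enable the Pub/Sub API for the active project
gcloud services enable pubsub.googleapis.com

# Install the Python client library used in the examples below
pip install google-cloud-pubsub
```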

Creating a Topic and Subscription
Using the CLI:

```bash
# Create a topic
gcloud pubsub topics create my-topic

# Create a subscription
gcloud pubsub subscriptions create my-subscription --topic my-topic
```

Using the Python SDK:

```python
from google.cloud import pubsub_v1

# Create a Publisher client and use it to create the topic
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("your-project-id", "my-topic")
publisher.create_topic(request={"name": topic_path})
```

Create a Subscriber client and a subscription on the topic:

```python
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("your-project-id", "my-subscription")
subscriber.create_subscription(
    request={"name": subscription_path, "topic": topic_path}
)
```
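
You can verify that both resources were created:

```bash
gcloud pubsub topics list
gcloud pubsub subscriptions list
```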



Publishing Messages
Using the CLI:


```bash
gcloud pubsub topics publish my-topic --message "Hello, Pub/Sub!"
```



Using the Python SDK:


```python
# Publish a message; publish() returns a future that resolves to the message ID
message = "Hello, Pub/Sub!"
future = publisher.publish(topic_path, message.encode("utf-8"))
print(future.result())
```
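
Messages can also carry key/value attributes, passed as extra keyword arguments to publish(); subscribers can read these without decoding the payload. A quick sketch (the attribute names here are arbitrary examples):

```python
future = publisher.publish(
    topic_path,
    b"Hello with metadata!",
    origin="python-sample",  # arbitrary example attribute
    username="demo-user",    # arbitrary example attribute
)
print(future.result())
```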

Subscribing to Messages
Using the CLI:

```bash
gcloud pubsub subscriptions pull my-subscription --auto-ack
```

Using the Python SDK:

```python
import time

def callback(message):
    print(f"Received message: {message.data.decode('utf-8')}")
    message.ack()

# subscribe() returns a streaming-pull future that delivers messages in the background
streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)

# Keep the main thread alive; the subscriber runs until interrupted
try:
    while True:
        time.sleep(60)
except KeyboardInterrupt:
    streaming_pull_future.cancel()
```
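
If you would rather fetch messages on demand than run a background stream, the client also supports synchronous pull; a minimal sketch:

```python
response = subscriber.pull(
    request={"subscription": subscription_path, "max_messages": 10}
)
for received in response.received_messages:
    print(received.message.data.decode("utf-8"))

# Acknowledge everything that was pulled
ack_ids = [received.ack_id for received in response.received_messages]
if ack_ids:
    subscriber.acknowledge(
        request={"subscription": subscription_path, "ack_ids": ack_ids}
    )
```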



Use Cases with Practical Examples

1. Triggers with Cloud Functions
Use Case: Automatically trigger a process whenever a new message is published to a Pub/Sub topic. For example, sending an email notification when a new user registers.

Solution: Use Cloud Functions to listen to Pub/Sub topics and handle messages.

Steps:
1. Create a Pub/Sub topic:

```bash
gcloud pubsub topics create user-registration
```
2. Write a Cloud Function:

```python
import base64

def send_email_notification(event, context):
    # The Pub/Sub payload arrives base64-encoded in event['data']
    message = base64.b64decode(event['data']).decode('utf-8')
    print(f"Sending email notification for: {message}")
```
3. Deploy the Cloud Function:

```bash
gcloud functions deploy sendEmailNotification \
    --runtime python39 \
    --trigger-topic user-registration \
    --entry-point send_email_notification
```
4. Publish a message to test:

```bash
gcloud pubsub topics publish user-registration --message "New User: John Doe"
```

Result: The Cloud Function logs the email notification process.
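
Before deploying, you can exercise the function locally by faking the Pub/Sub event payload; a quick sketch (assuming the function lives in main.py):

```python
import base64
from main import send_email_notification  # assumed module name

fake_event = {"data": base64.b64encode(b"New User: John Doe").decode("utf-8")}
send_email_notification(fake_event, None)
# Prints: Sending email notification for: New User: John Doe
```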

2. Streaming Data to BigQuery
Use Case: Process and analyze large volumes of event data, such as IoT sensor readings or transaction logs, by streaming messages from Pub/Sub to BigQuery.

Solution: Use a pre-built Dataflow template to ingest data into BigQuery.

Steps:

1. Create a Pub/Sub topic and subscription:

```bash
gcloud pubsub topics create sensor-data
gcloud pubsub subscriptions create sensor-data-sub --topic=sensor-data
```
2. Create a BigQuery dataset and table:

```bash
bq mk my_dataset
bq mk --table my_dataset.sensor_table schema.json
```

(Replace schema.json with your table schema file; an example schema is shown after these steps.)
3. Run the Dataflow job:

```bash
gcloud dataflow jobs run pubsub-to-bigquery \
    --gcs-location gs://dataflow-templates-us-central1/latest/PubSub_to_BigQuery \
    --region us-central1 \
    --parameters inputTopic=projects/your-project-id/topics/sensor-data,outputTableSpec=your-project-id:my_dataset.sensor_table
```
4. Publish messages to the topic:

```bash
gcloud pubsub topics publish sensor-data --message '{"temperature": 25, "humidity": 60}'
```

Result: The message data should appear in the BigQuery table for further analysis.
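
For reference, a minimal schema.json matching the sample message above might look like this (the field names and types are assumptions based on the example payload):

```json
[
  {"name": "temperature", "type": "INTEGER", "mode": "NULLABLE"},
  {"name": "humidity", "type": "INTEGER", "mode": "NULLABLE"}
]
```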

3. Logging Events with Sinks
Use Case: Centralize and process logs by routing Cloud Logging events to a Pub/Sub topic for downstream processing, like alerting or archiving.

Solution: Create a logging sink targeting a Pub/Sub topic.

Steps:

1. Create a Pub/Sub topic and a subscription for downstream processing:

```bash
gcloud pubsub topics create log-events
gcloud pubsub subscriptions create log-events-sub --topic=log-events
```

2. Create a logging sink (optionally pass --log-filter to route only matching entries):

```bash
gcloud logging sinks create log-sink \
    pubsub.googleapis.com/projects/your-project-id/topics/log-events
```


3. Set IAM permissions so the sink's service account can publish to the topic:

```bash
gcloud pubsub topics add-iam-policy-binding log-events \
    --member=serviceAccount:service-account-email \
    --role=roles/pubsub.publisher
```


4. Create a subscriber to process the logs:

```python
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path('your-project-id', 'log-events-sub')

def callback(message):
    print(f"Received log: {message.data.decode('utf-8')}")
    message.ack()

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
print("Listening for log events...")

# Block the main thread so the background stream keeps running
streaming_pull_future.result()
```

Result: Log messages flow into Pub/Sub and can be processed by your custom subscriber.
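
Each sink entry arrives as a JSON-serialized LogEntry, so the callback can parse it to inspect fields such as severity; a minimal sketch:

```python
import json

def callback(message):
    entry = json.loads(message.data.decode("utf-8"))
    severity = entry.get("severity", "DEFAULT")
    payload = entry.get("textPayload") or entry.get("jsonPayload")
    print(f"[{severity}] {payload}")
    message.ack()
```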

4. Streaming Messages to Cloud Storage
Use Case: Archive real-time data, such as application usage metrics or user activity logs, to Cloud Storage for long-term storage.

Solution: Use Dataflow to write Pub/Sub messages to a Cloud Storage bucket.

Steps:

1. Create a Pub/Sub topic and subscription:

```bash
gcloud pubsub topics create app-metrics
gcloud pubsub subscriptions create app-metrics-sub --topic=app-metrics
```
2. Create a Cloud Storage bucket:

```bash
gsutil mb gs://your-bucket-name
```
3. Run the Dataflow job:

```bash
gcloud dataflow jobs run pubsub-to-gcs \
    --gcs-location gs://dataflow-templates-us-central1/latest/Cloud_PubSub_to_GCS_Text \
    --region us-central1 \
    --parameters inputTopic=projects/your-project-id/topics/app-metrics,outputDirectory=gs://your-bucket-name/data/,outputFilenamePrefix=metrics,outputFilenameSuffix=.json
```
4. Publish messages:

```bash
gcloud pubsub topics publish app-metrics --message '{"event": "page_view", "user": "123"}'
```

Result: Messages are written as JSON files in the specified Cloud Storage bucket.
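
Note that the template writes output in windowed batches, so files appear after each window closes rather than immediately. You can confirm output is landing with:

```bash
gsutil ls gs://your-bucket-name/data/
```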

Conclusion
Google Pub/Sub is a versatile tool for building scalable and reliable applications. By understanding the core concepts and leveraging the CLI and Python SDK, you can effectively utilize Pub/Sub to solve a variety of real-world problems.
