Integrate Aiven for Apache Flink® with Google BigQuery

Aiven for Apache Flink® is a fully managed service that provides distributed, stateful stream processing capabilities. Google BigQuery is a cloud-based data warehouse that is easy to use, can handle large amounts of data without needing servers, and is cost-effective. By connecting Aiven for Apache Flink® with Google BigQuery, you can stream data from Aiven for Apache Flink® to Google BigQuery, where it can be stored and analyzed.

Aiven for Apache Flink® uses BigQuery Connector for Apache Flink as the connector to connect to Google BigQuery.

Learn how to connect Aiven for Apache Flink® with Google BigQuery as a sink using the Aiven CLI or the Aiven Console.

Prerequisites

  • An Aiven for Apache Flink® service.

  • A Google Cloud Platform (GCP) account.

  • Necessary permissions to create resources and manage integrations in GCP.

Configure integration using Aiven CLI

To configure the integration using the Aiven CLI, follow these steps:

Step 1: Create or use an Aiven for Apache Flink service

You can use an existing Aiven for Apache Flink service. To get a list of all your existing Flink services, use the following command:

avn service list --project <project_name> --service-type flink

Alternatively, if you need to create a new Aiven for Apache Flink service, you can use the following command:

avn service create -t flink -p <project-name> --cloud <cloud-name> <flink-service-name>

where:

  • -t flink: The type of service to create, which is Aiven for Apache Flink.

  • -p <project-name>: The name of the Aiven project where the service should be created.

  • --cloud <cloud-name>: The name of the cloud provider on which the service should be created.

  • <flink-service-name>: The name of the new Aiven for Apache Flink service to be created. This name must be unique within the specified project.
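
For example, here is a hypothetical invocation. The project, cloud, and service names are placeholders, and depending on your account you may also need to choose a plan with --plan (plan names such as business-4 vary by service and cloud):

avn service create -t flink -p my-project --cloud google-europe-west1 --plan business-4 my-flink-service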

Step 2: Configure GCP for a Google BigQuery sink connector

To sink data from Aiven for Apache Flink to Google BigQuery, perform the following steps in the GCP console (a command-line sketch follows the list):

  • Create a new Google service account and generate a JSON service key.

  • Verify that BigQuery API is enabled.

  • Create a new BigQuery dataset, or identify an existing one, where the data will be stored.

  • Grant dataset access to the service account.
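
If you prefer the command line over the GCP console, here is a minimal sketch of the same steps using the gcloud and bq CLIs. All names (my-gcp-project, flink-bq-writer, bqdataset) are placeholders, and the project-wide roles/bigquery.dataEditor binding is one coarse way to grant dataset access; dataset-level permissions work as well:

# Enable the BigQuery API in the project
gcloud services enable bigquery.googleapis.com --project my-gcp-project

# Create a service account and generate a JSON key for it
gcloud iam service-accounts create flink-bq-writer --project my-gcp-project
gcloud iam service-accounts keys create key.json \
  --iam-account flink-bq-writer@my-gcp-project.iam.gserviceaccount.com

# Create the dataset that will receive the data
bq mk --dataset my-gcp-project:bqdataset

# Grant the service account write access to the project
gcloud projects add-iam-policy-binding my-gcp-project \
  --member serviceAccount:flink-bq-writer@my-gcp-project.iam.gserviceaccount.com \
  --role roles/bigquery.dataEditor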

Step 3: Create an external Google BigQuery endpoint

To integrate Google BigQuery with Aiven for Apache Flink, you need to create an external BigQuery endpoint. You can use the avn service integration-endpoint-create command with the required parameters. This command will create a new integration endpoint that can be used to connect to a BigQuery service.

avn service integration-endpoint-create \
--project <project_name> \
--endpoint-name <endpoint_name> \
--endpoint-type external_bigquery \
--user-config-json '{
    "project_id": "<gcp_project_id>",
    "service_account_credentials": {
        "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
        "client_email": "<service_account_email>",
        "client_id": "<client_id>",
        "client_x509_cert_url": "<client_x509_cert_url>",
        "private_key": "<private_key_content>",
        "private_key_id": "<private_key_id>",
        "project_id": "<service_account_project_id>",
        "token_uri": "https://oauth2.googleapis.com/token",
        "type": "service_account"
    }
}'

where:

  • --project: Specify the name of the project where you want to create the integration endpoint.

  • --endpoint-name: The name of the integration endpoint you are creating. Replace <endpoint_name> with your desired endpoint name.

  • --endpoint-type: Specify the type of integration endpoint. For example, if it’s an external BigQuery service, enter external_bigquery.

  • --user-config-json: This parameter allows you to provide a JSON object with custom configurations for the integration endpoint. The JSON object should include the following fields:

    • project_id: Your actual Google Cloud Platform project ID.

    • service_account_credentials: An object that holds the necessary credentials for authenticating and accessing the external Google BigQuery service. This object should include the following fields:

      • auth_provider_x509_cert_url: The URL where the authentication provider’s x509 certificate can be fetched.

      • auth_uri: The URI used for authenticating requests.

      • client_email: The email address associated with the service account.

      • client_id: The client ID associated with the service account.

      • client_x509_cert_url: The URL to fetch the public x509 certificate for the service account.

      • private_key: The private key content associated with the service account.

      • private_key_id: The ID of the private key associated with the service account.

      • project_id: The project ID associated with the service account.

      • token_uri: The URI used to obtain an access token.

      • type: The credential type, which is typically set to service_account.

Aiven CLI Example: Creating an external BigQuery integration endpoint

avn service integration-endpoint-create \
--project aiven-test \
--endpoint-name my-bigquery-endpoint \
--endpoint-type external_bigquery \
--user-config-json '{
    "project_id": "my-bigquery-project",
    "service_account_credentials": {
        "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
        "client_email": "bigquery-test@project.iam.gserviceaccount.com",
        "client_id": "284765298137902130451",
        "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/bigquery-test%40project.iam.gserviceaccount.com",
        "private_key": "ADD_PRIVATE_KEY",
        "private_key_id": "ADD_PRIVATE_KEY_ID",
        "project_id": "my-bigquery-project",
        "token_uri": "https://oauth2.googleapis.com/token",
        "type": "service_account"
    }
}'

Step 4: Create an integration for Google BigQuery

Now, create an integration between your Aiven for Apache Flink service and your BigQuery endpoint:

avn service integration-create \
    --source-endpoint-id <source-endpoint-id> \
    --dest-service <flink-service-name> \
    -t flink_external_bigquery

For example:

avn service integration-create \
    --source-endpoint-id eb870a84-b91c-4fd7-bbbc-3ede5fafb9a2 \
    --dest-service flink-1 \
    -t flink_external_bigquery

where:

  • --source-endpoint-id: The ID of the integration endpoint you want to use as the source. In this case, it is the ID of the external Google BigQuery integration endpoint. In this example, the ID is eb870a84-b91c-4fd7-bbbc-3ede5fafb9a2.

  • --dest-service: The name of the Aiven for Apache Flink service you want to integrate with the external BigQuery endpoint. In this example, the service name is flink-1.

  • -t: The type of integration you want to create. In this case, the flink_external_bigquery integration type is used to integrate Aiven for Apache Flink with an external BigQuery endpoint.

Step 5: Verify integration with service

After creating the integration between Aiven for Apache Flink and Google BigQuery, the next step is to verify that the integration was created successfully and then create Aiven for Apache Flink applications that use it.

To verify that the integration has been created successfully, run the following command:

avn service integration-list --project <project-name> <flink-service-name>

For example:

avn service integration-list --project systest-project flink-1

where:

  • --project: The name of the Aiven project that contains the Aiven service you want to list integrations for. In this example, the project name is systest-project.

  • flink-1: The name of the Aiven service you want to list integrations for. In this example, the service name is flink-1, which is an Aiven for Apache Flink service.

To create Aiven for Apache Flink applications, you need the integration ID of the Aiven for Apache Flink service. Obtain the integration_id from the integration list, as in the sketch below.
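
If you want to script this step, the following is a minimal sketch that extracts the ID using the CLI's --json output and jq. The integration_type and service_integration_id field names are assumptions; verify them against your own --json output:

avn service integration-list --project systest-project flink-1 --json \
  | jq -r '.[] | select(.integration_type == "flink_external_bigquery") | .service_integration_id'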

Step 6: Create Aiven for Apache Flink applications

With the integration ID obtained from the previous step, you can now create an application that uses the integration. For information on how to create Aiven for Apache Flink applications, see avn service flink create-application.
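
As a rough sketch, and using the application name bq-sink-app as a placeholder, such an invocation could look like the following; check the avn service flink create-application documentation for the exact properties schema:

avn service flink create-application flink-1 --project systest-project '{"name": "bq-sink-app"}'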

The following is an example of a Google BigQuery sink table:

CREATE TABLE `table1` (
    `name` STRING
)
WITH (
    'connector' = 'bigquery',
    'service-account' = '',
    'project-id' = '',
    'dataset' = 'bqdataset',
    'table' = 'bqtable',
    'table-create-if-not-exists' = 'true'
)

If the integration is created successfully, the service account credentials and project ID are automatically populated in the sink table (if you have left them blank, as shown in the example above).
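
To illustrate how the sink table is used inside an application, here is a minimal sketch of a statement that streams rows into it; source_topic is a hypothetical source table (for example, one backed by an Apache Kafka® topic):

INSERT INTO table1
SELECT name
FROM source_topic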

Configure integration using Aiven Console

If you're using Google BigQuery for your data storage and analysis, you can integrate it as a sink for Aiven for Apache Flink streams. To achieve this via the Aiven Console, follow these steps:

  1. Log in to Aiven Console and choose your project.

  2. From the Services page, you can either create a new Aiven for Apache Flink service or select an existing service.

  3. Next, configure a Google BigQuery service integration endpoint:

    • Navigate to the Projects screen where all the services are listed.

    • From the left sidebar, select Integration endpoints.

    • Select Google Cloud BigQuery from the list, and then select Add new endpoint or Create new.

    • Enter the following details to set up the integration:

      • Endpoint name: Enter a name for the integration endpoint. For example, Aiven_BigQuery_Integration.

      • GCP Project ID: The identifier associated with your Google Cloud Project where BigQuery is set up. For example, my-gcp-project-12345.

      • Google Service Account Credentials: The JSON formatted credentials obtained from your Google Cloud Console for service account authentication. For example:

        {
            "type": "service_account",
            "project_id": "my-gcp-project-12345",
            "private_key_id": "abcd1234",
            ...
        }
        

        For more information, see Integrate Google BigQuery endpoints with Aiven services.

      • Select Create.

  4. Select Services and access the Aiven for Apache Flink service where you plan to integrate the Google BigQuery endpoint.

  5. If you’re integrating with Aiven for Apache Flink for the first time, select Create data pipeline on the Overview page. Alternatively, you can add a new integration in the Data Flow section by using the plus (+) button.

  6. On the Data Service integrations screen, select the Create external integration endpoint tab.

  7. Select the checkbox next to BigQuery, and choose the BigQuery endpoint from the list to integrate.

  8. Select Integrate.

Once you have completed these steps, the integration will be ready. You can now start creating Aiven for Apache Flink applications that use Google BigQuery as a sink.
