About the Role:
We are seeking a Data Engineer to design, develop, and maintain scalable data pipelines, webhook delivery systems, and indexer-as-a-service products. This role is ideal for someone with strong expertise in real-time data processing, event-driven architectures, and distributed systems.
As a key contributor, you will work with technologies like Kafka, PostgreSQL, Parquet, and message queues to ensure efficient, reliable, and scalable data movement across applications. You will collaborate with engineering, product, and data teams to build high-performance systems that support real-time event processing, API integrations, and large-scale data indexing.
Key Responsibilities:
- Build and maintain scalable data pipelines for ingesting, processing, and storing structured and unstructured data.
- Own and optimize webhook delivery infrastructure, ensuring reliable event transmission with retry mechanisms, rate limiting, and observability.
- Develop and maintain an indexer-as-a-service platform, enabling efficient search and retrieval of large-scale datasets.
- Leverage Apache Kafka and message queues to process and distribute real-time event streams.
- Optimize storage layouts with Apache Parquet for efficient querying and compact on-disk storage.
- Manage PostgreSQL databases, ensuring high availability, performance, and scalability.
- Work closely with backend and frontend teams to integrate APIs, webhook systems, and indexing solutions.
- Implement monitoring, alerting, and observability tools to track pipeline performance, webhook reliability, and indexer efficiency.
- Ensure data integrity, security, and compliance across all data systems.
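For a flavor of the webhook delivery work described above, here is a minimal sketch of retry logic with exponential backoff and jitter. All names (`deliver_with_retry`, `flaky_send`) are hypothetical, and the `send` callable stands in for a real HTTP client; a production system would add rate limiting, dead-lettering, and observability hooks.

```python
import random
import time

def deliver_with_retry(send, payload, max_attempts=5, base_delay=1.0):
    """Attempt delivery, retrying failures with exponential backoff.

    `send` is any callable returning True on success (e.g. a wrapper
    around an HTTP POST); it is a stand-in for a real delivery client.
    Returns the attempt count on success, or None when retries are
    exhausted (where a real system would dead-letter the event).
    """
    for attempt in range(1, max_attempts + 1):
        if send(payload):
            return attempt
        if attempt < max_attempts:
            # Full jitter spreads retries out and avoids thundering herds.
            delay = random.uniform(0, base_delay * 2 ** (attempt - 1))
            time.sleep(delay)
    return None

# Simulated flaky endpoint: fails twice, then succeeds on the third try.
attempts_seen = {"n": 0}
def flaky_send(payload):
    attempts_seen["n"] += 1
    return attempts_seen["n"] >= 3

print(deliver_with_retry(flaky_send, {"event": "user.created"}, base_delay=0.01))
# prints 3
```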
Required Skills & Qualifications:
- Proficiency in Python for data processing, automation, and API integration.
- Strong experience with PostgreSQL and other relational databases, including indexing, partitioning, and performance tuning.
- Expertise in Apache Kafka and event-driven architectures.
- Familiarity with the Apache Parquet file format for optimized storage and query performance.
- Experience with containerization and orchestration (Docker, Kubernetes).
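To illustrate the columnar idea behind the Parquet requirement above, here is a pure-Python sketch contrasting row-oriented and column-oriented layouts; the data and the `column_sum` helper are invented for illustration, not part of any Parquet API.

```python
# Row-oriented layout: one record per row, as a transactional store keeps it.
rows = [
    {"user_id": 1, "country": "US", "amount": 30},
    {"user_id": 2, "country": "DE", "amount": 45},
    {"user_id": 3, "country": "US", "amount": 12},
]

# Columnar layout: one array per field, the organizing idea behind Parquet.
columns = {field: [r[field] for r in rows] for field in rows[0]}

def column_sum(cols, field):
    """Aggregate one field by scanning a single contiguous column,
    never touching the bytes of any other field."""
    return sum(cols[field])

print(column_sum(columns, "amount"))  # prints 87
```

This locality (plus per-column compression and encoding) is why analytical queries over a few columns run far faster against Parquet than against row-oriented files.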