
PP - Data Engineer
- Brazil
- Permanent
- Full-time
- Architect, develop, and maintain scalable data pipelines and ETL workflows that support the ingestion, transformation, and storage of large datasets from diverse sources.
- Implement automated data quality checks and validation processes to ensure the accuracy, consistency, and reliability of data across systems.
- Work closely with product managers, data analysts, and business stakeholders to gather and understand data requirements, translating them into technical specifications and actionable engineering tasks.
- Continuously monitor and optimize data systems for performance, scalability, and cost-efficiency, ensuring that data infrastructure meets evolving business needs.
- Diagnose and resolve data-related issues promptly, providing root cause analysis and implementing preventive measures.
- Maintain comprehensive documentation of data pipelines, ETL processes, and system architecture. Participate in design and code reviews to uphold high engineering standards.
- Stay abreast of emerging data engineering technologies, tools, and best practices to drive innovation and continuous improvement within the team.
- Provide guidance and mentorship to junior data engineers, fostering a culture of knowledge sharing and technical excellence.
- Bachelor's degree in Computer Science, Engineering, or a related field.
- SQL: Expert-level proficiency in SQL for querying, manipulating, and optimizing relational databases. Ability to write complex queries, optimize performance, and work with large datasets efficiently.
- Python: Strong programming skills in Python, including experience with data processing libraries such as Pandas. Ability to develop robust, maintainable, and scalable data processing scripts and automation tools.
- PySpark: Proficient in using PySpark for distributed data processing on large-scale datasets. Experience with Spark’s DataFrame API, RDDs, and performance tuning in a big data environment.
- ETL (Extract, Transform, Load): Deep understanding of ETL concepts and hands-on experience designing and implementing ETL pipelines that ensure data integrity and efficiency.
- Data Modeling: Expertise in data modeling techniques to design logical and physical data models that support efficient querying and reporting. Familiarity with normalization, denormalization, and schema design best practices.
- Relational Databases: Experience working with relational database management systems (RDBMS) such as Oracle, MySQL, or similar platforms. Knowledge of database design, indexing, and query optimization.
- Data Warehousing: Solid understanding of data warehousing concepts, architectures, and best practices. Experience building and maintaining data warehouses that support business intelligence and analytics.
- Unix/Linux: Proficiency in Unix/Linux operating systems for managing data workflows, scripting, and system monitoring.
- Shell Scripting: Ability to write shell scripts to automate routine tasks, manage data pipelines, and integrate with other system components.
- Automation Testing: Experience implementing automated testing frameworks for data pipelines and ETL processes to ensure data quality and system reliability.
- Bachelor’s Degree: A Bachelor’s degree in Computer Science, Engineering, or a related technical field.
- Professional Experience: At least 3 years of proven experience as a Data Engineer or in a similar role, with a strong background in database development, ETL processes, and software development.