JOB DESCRIPTION: Spark, Scala/Python, Hive, Hadoop, Big Data developer with exposure to cloud (Azure preferred).
• 4-5 years of experience building and implementing data ingestion and curation processes using big data tools such as Spark (Scala/Python), Hive, HDFS, Sqoop, HBase, Kerberos, Sentry, Impala, etc.
• Ingesting large volumes of data from various platforms for analytics needs and writing high-performance, reliable, and maintainable ETL code. Strong SQL knowledge and data analysis skills for data anomaly detection and data quality assurance.
• Hands-on experience writing shell scripts, complex SQL queries, and Hadoop commands, and working with Git.
• Good hands-on experience creating databases, schemas, and Hive tables (external and managed) with various file formats (ORC, Parquet, Avro, Text, etc.), complex transformations, partitioning, bucketing, and performance optimizations.
• Recent exposure to cloud is good to have; Azure is preferred.
• Spark: complex transformations, DataFrames, semi-structured data, utilities using Spark, Spark SQL, and Spark configurations.
• Proficiency and extensive experience with Spark and Scala/Python, including performance tuning, is a must.
• Monitoring performance of production jobs and advising on any necessary infrastructure changes.
• Ability to write abstracted, reusable code components.
• Code versioning experience using Bitbucket and CI/CD pipelines.
Job Type
Full-time role
Skills required
Spark (Scala/Python), Hive, Hadoop, HDFS, Sqoop, HBase, Impala, SQL, shell scripting, Git, Azure.
Location
Cary, North Carolina
Salary
No salary information was found.
Date Posted
May 28, 2025
CapB InfoteK is seeking a Hadoop Big Data Developer with expertise in Spark, Scala/Python, and cloud technologies. The role involves building data ingestion processes and optimizing ETL code for analytics.