Lead Data Science and Analytics Engineer
Houston, TX, USA
Req #1865
Wednesday, June 19, 2024

POSITION SUMMARY
The position is responsible for capturing requirements and performing data analysis, in addition to providing data engineering expertise to support the advancement of data-driven solutions that improve operational efficiency and drive business value. This hybrid role is involved in the full project lifecycle, which includes driving discovery and requirements definition, data-related development, and operational support/administration as part of the data management team responsible for the cloud data platform in Google Cloud Platform (GCP). This position works directly with project stakeholders and other teams on advanced analytics, BI, and data science initiatives.

Key Accountabilities
• Build and maintain the GCP data environment and processes (BigQuery, Cloud SQL, Dataproc, Bigtable, Stackdriver, etc.).
• Coordinate with business and IT partners to gather and define requirements and design data workflows.
• Perform data analysis, drive data model design, define data mappings, and validate data.
• Create and maintain documentation (requirements, design, data models, process diagrams, user stories, support, and training).
• Work with data stewards to maintain the data catalog.
• Work hand-in-hand with application development and the data science team to understand business needs and coordinate product changes.
• Build prototypes for data collection, evaluation, and integration to assist with data discovery and analysis.
• Build stream ingestion processes to efficiently send, process, analyze, and publish data.
• Build automated workflows to ingest and process batch data.
• Manage workflows in support of both product and data science pipelines.
• Build and deploy large-scale ETL, ELT, and stream processing pipelines in a serverless microservice infrastructure using industry-standard technologies such as SQL, Python, Java, Kubernetes, Dataflow, Spark, etc.
• Develop and support data integration using APIs, Pub/Sub, etc.
• Analyze large structured and unstructured datasets to solve multiple complex business problems.
• Provide operational support and build dashboards to monitor and maintain the GCP environment.
• Develop processes to validate, report, and address data quality.
• Set up auditing and logging to support the data platform.
• Develop frameworks and utilities as needed to monitor and support the data platform and facilitate user access to data.

Required Knowledge, Skills, and Abilities
• Excellent verbal and written communication skills with great attention to detail.
• Excellent documentation skills.
• Ability to multitask and prioritize effectively across multiple projects.
• Ability to complete tasks efficiently in a fast-paced environment.
• Ability to form effective working relationships and collaborate with different teams.
• Ability to troubleshoot complex and ambiguous environments to find and solve problems.
• Experience with on-premises, private, and/or hybrid cloud solutions.
• Proven experience administering, monitoring, and supporting production systems and applications.
• Proven experience with RDBMSs such as PostgreSQL and SQL Server, and NoSQL databases such as HBase and Bigtable.
• Experience writing production-quality, modular code following coding best practices in Python, Java, Spark, Scala, or related — required.
• Experience building and using RESTful APIs.
• Experience with Pub/Sub and Dataflow, or Kafka and Spark Streaming.
• Experience with IoT, time-series, or machine-generated data is a plus.
• Experience with data processing platforms such as Spark, Hadoop, Hive, Sqoop, Airflow, Google Cloud Platform (Dataproc, Dataflow, BigQuery, Compute Engine, Data Fusion), AWS (EMR, Kinesis, Lambda, Glue), or Azure (HDInsight, Data Factory).
• Experience supporting business intelligence or large-scale data warehouses/lakes using technologies such as Hadoop, Kafka, IBM Cognos, Microsoft, MicroStrategy, Oracle, Snowflake, Tableau, Power BI, Teradata, and/or similar; Unifi Software Data Catalog is a plus.
• Working knowledge of Linux required; RHEL/CentOS preferred.
• Experience with container technologies (Docker and Kubernetes/GKE).
• Familiarity with Agile methodologies and related methods.
• Experience with oil and gas software systems and data formats is preferred.

Bachelor's Degree - Required

Minimum Required Work Experience
• More than 10 years Software Engineering - Required
• 2-3 years Business Analyst or Project Manager - Required

Other details
• Job Family: Digital
• Job Function: Digital
• Pay Type: Salary
• Employment Indicator: NexTier
• Location: Houston, TX, USA

Last updated: 2024-10-21
Job Type: Full-time
Skills required: PostgreSQL, Python, Java, Azure, Kubernetes, Agile
Location: Houston, TX
Salary: Not specified
Date Posted: October 21, 2024
NexTier Oilfield Solutions is seeking a Lead Data Science and Analytics Engineer in Houston, TX, responsible for data analysis and data engineering to enhance operational efficiency. This hybrid role involves collaborating with stakeholders on advanced analytics, BI, and data science initiatives.