Determining My Early Career Direction


What AI Won’t Replace

When I started vibe coding, I realized something profound was happening. No wonder so many PMs keep saying they’ll replace programmers: I’ve actually seen colleagues with zero programming experience ship web apps.

So where does that leave us CS graduates?

AI suggests focusing on complex tasks or domain-specific work. I’m not particularly interested in business logic, so let’s talk about tackling complexity instead. Think distributed systems, high concurrency, load balancing. I’m not entirely sure yet, but heading toward systems that more people use and that sit at the core of infrastructure seems like the right call. I need to consciously steer my work in this direction.

Small-scale software development and frontend work feel increasingly pointless.

Infrastructure

I’ve suddenly become obsessed with infrastructure. I like the barbell strategy, and I see engineering work as its high-certainty side.

It’s a massive field, so I had AI help me categorize it (see the table at the end).

Personally, I feel infrastructure engineering is deeply technical, and the systems you build need to be genuinely complex. However, I don’t want to go deep into hardware—I don’t have much foundation there. AI recommends roles like AI Infrastructure Engineer, ML Platform Engineer, DevOps Engineer, Cloud Architect, or AI Operations Engineer.

Plus, infrastructure skills transfer well from tech companies to quant dev roles. I just need solid financial markets knowledge, which is exactly what I’m planning to spend Saturdays learning.

Current Focus

Learning complex applications while looking for opportunities to work on complex system architecture.

Dedicating an hour daily to personal project preparation.

Stay Open

Keep an open mind.

Sometimes interesting new products, like browsers or VS Code, are happy accidents. You can’t plan everything rationally; you need intuition too. Everything changes.

AI Infrastructure Roles Breakdown

| Role | Responsibilities | Tech Stack | Languages |
| --- | --- | --- | --- |
| AI Hardware Engineer | Configure and manage hardware resources; design and optimize AI computing platforms (GPU, TPU, FPGA) | Hardware: NVIDIA GPU, Google TPU, Intel FPGA; Optimization: CUDA, cuDNN, TensorRT; Monitoring: nvidia-smi, GPU-Z | C/C++: hardware interfaces & optimization; Python: hardware interaction; Verilog/VHDL: hardware design |
| AI Infrastructure Engineer | Design and manage compute and storage resource allocation; ensure efficient AI workload execution | Cloud: AWS, GCP, Azure, OpenStack; Resource Management: Slurm, Kubernetes; Containerization: Docker, Kubernetes | Python: automation & cloud APIs; Go: infrastructure tools; Bash/Shell: automation scripts |
| Data Engineer | Design, develop and optimize data pipelines; handle large-scale storage and stream processing for AI training | Big Data: Apache Spark, Hadoop, Flink; ETL: Airflow; Databases: PostgreSQL, MySQL, BigQuery, Snowflake | Python: data processing; Scala: big data; SQL: queries & management; Java: big data frameworks |
| Networking Engineer | Manage network communication layers; optimize cross-node data transfer for low latency and high bandwidth | Protocols: gRPC, HTTP/2, WebSocket; High-speed: RDMA, InfiniBand; Load Balancing: Nginx, HAProxy | C/C++: protocol implementation; Python: monitoring & automation; Bash/Shell: network scripts |
| Storage Engineer | Design and optimize storage architecture; ensure efficient, scalable, and secure data persistence | Distributed Storage: HDFS, Ceph, Amazon S3; Databases: Cassandra, MongoDB, Redis; Tools: DVC, ModelDB | Python: data & storage management; Java: distributed storage systems; Go: high-performance storage; SQL: query optimization |
| ML Platform Engineer | Build and maintain ML platforms; support automated training, experiment tracking, and model versioning | Frameworks: TensorFlow, PyTorch, JAX, Keras; Model Management: MLflow, TFX, DVC; Distributed Training: Horovod, Ray, PyTorch Distributed | Python: ML algorithms & models; Bash/Shell: automation; Go: platform tools; Java: distributed training |
| DevOps Engineer | Design and implement CI/CD workflows; automate deployment, updates, and maintenance of AI systems | Automation: Ansible, Terraform, Chef, Puppet; CI/CD: Jenkins, GitLab CI, CircleCI; Containerization: Docker, Kubernetes | Python: automation; Bash/Shell: scripting; Groovy: Jenkins pipelines; Go: CI/CD tools |
| AI Operations Engineer | Operate AI systems; monitor model inference services, ensure high availability and optimize performance | Monitoring: Prometheus, Grafana, Datadog; Logging: ELK Stack; Serving: TensorFlow Serving, Triton, TorchServe | Python: monitoring & automation; Bash/Shell: operations; Go: ops tools; Java: server optimization |
| Security Engineer | Ensure AI system security; protect data and models, prevent attacks, ensure compliance | Crypto & Auth: SSL/TLS, OAuth, JWT, AES; Compliance: GDPR, HIPAA, SOC2; Container Security: Aqua Security, Twistlock | Python: security scripts; C/C++: secure communication; Go: container security; Bash/Shell: auditing |
| Cloud Architect | Design and manage AI cloud architecture; ensure high availability, elasticity, fault tolerance, and security | Platforms: AWS, GCP, Azure; Storage: S3, Google Cloud Storage, Azure Blob; Tools: Terraform, CloudFormation | Python: automation & integration; Go: resource management; Bash/Shell: cloud ops; Java: cloud applications |
| System Architect | Design overall AI system architecture; coordinate compute, storage, and network resources efficiently | Distributed: Apache Kafka, Spark, Kubernetes; Design Tools: UML, ArchiMate; Databases: SQL, NoSQL, GraphDB | Python: automation & optimization; Go: distributed systems; Java: large-scale architecture; C/C++: low-level optimization |
| AI Research Engineer | Research and develop AI algorithms; propose new methods and optimizations, improve existing models | Frameworks: TensorFlow, PyTorch, JAX; Languages: Python, C++, R; Optimization: Optuna, Hyperopt, Ray Tune | Python: algorithm development; C++: performance optimization; R: statistical analysis; Julia: high-performance computing |