Senior Systems Engineer
SurveyMonkey | Palo Alto, USA
SurveyMonkey is the world's leading provider of web-based survey solutions, but there's really much more to us than that. We're a smart, passionate group of people who work hard to deliver the best survey experience on the planet, period. We do this because we believe everyone deserves easy access to the insights and information they need to make better, more informed decisions.
We're also proud to admit that despite our incredible growth over the past 15 years, we refuse to grow up. We are still small and nimble; everyone plays an impactful role; and when we say good ideas can come from anyone, we mean it.
SurveyMonkey is trusted by millions of customers, including 99% of the Fortune 500, as well as other businesses, academic institutions and organizations of all shapes and sizes. We collect 2.8 million survey responses daily from people in all countries around the world.
If this sounds like home to you, and you're ready to make your work matter to millions, we'd love to meet you.
The Senior Systems Engineer will be responsible for the reliability and availability of our underlying infrastructure, and for directing strategic decisions about it. As part of the operations team, you will join a fast-paced, enthusiastic group responsible for the 24x7x365 success of a high-performance, dynamic environment. SurveyMonkey operates one of the largest commercial Python SOA implementations on the web, serving 5.5 million surveys created by over 2 million customers (22 million questions answered each day). To support this continuous growth, strong leadership is required as we push towards the bleeding edge of new technologies (OpenStack, Docker, Mesosphere, Ceph, etc.).
Design, implement, and support programmatic solutions for day-to-day production, daily administration and scaling issues
Perform deep dives into both systemic and latent reliability issues; partner with software and database engineers across the organization to produce and roll out fixes
Support shared infrastructure (e.g., caching, persistence layer, messaging queue)
Design and build the tools to manage our rapidly growing infrastructure
Support our devops and engineering team(s) and continuous delivery of SurveyMonkey products to our customers
Troubleshoot issues across the entire stack - hardware, software, application and network
Take part in a shared 24x7 on-call rotation that won’t cripple your life or kill your soul
Demonstrated competence in shell scripting and/or high-level languages (Python and Ansible a plus)
Strong communication and documentation skills
Understanding of global-scale SOA systems, to support the scaling and tuning of a large-scale website
Practical knowledge of various aspects of service design, including messaging protocols & behavior, caching strategies and software design practices
Solid understanding of Linux internals (we favor Ubuntu) and package management
Experience with a variety of core service tools (such as memcached, Redis, ZooKeeper, Kafka, Hadoop and nginx)
Experience working with operational infrastructure tools: OpenStack, VMware, KVM
Experience laying the foundation on which to build new technologies (Mesos, Docker, Ceph, etc.)
Experience managing large-scale infrastructure, systems, technologies and protocols, including but not limited to BIND 9, SSH, SAN, NFS, SMB, Zabbix, Splunk and New Relic
Strong understanding of modern networking (routing, switching, firewalls, TCP/IP, multicast, unicast, subnets, MLAG, iRules)
Passion for automating anything and everything
Be able to lift, rack and stack production hardware
Ability to travel domestically and internationally (conferences, remote offices and data centers)
At SurveyMonkey, we offer competitive salaries, medical/dental benefits, PTO, 401k, paid holidays, and equity compensation.
SurveyMonkey is an equal opportunity employer.