Front-end Jobs near New York, NY
Full Stack Software Engineer
ResoluteAI is a fast-growth data aggregation and intelligent search startup, with a mission to enable scientifically driven organizations to make their next big discovery.
We are looking for full-stack developers to help build out our capabilities as we continue to grow.
Duties and Responsibilities:
Integrate public domain datasets into the search platform using our ETL framework.
Design and implement new features that integrate and highlight the various public datasets.
Work with various external enterprise search APIs to ingest and enrich clients’ documents for search.
Assist machine learning engineers with deploying their models into production.
Communicate directly with clients on requirements, progress, and setting expectations.
Assist with bug fixes and writing of regression and unit tests.
Education and Previous Experience:
Bachelor’s degree in Computer Science, or equivalent experience in software development
5+ years of related experience and training
Knowledge and Skills:
Strong Python skills. Experience with SQL and an ORM (e.g. SQLAlchemy).
Can hold your own on the frontend UI in ReactJS
Comfortable working with Docker and in a Linux environment
Experience with ElasticSearch – very strong plus
Experience with Elastic MapReduce or Spark is a plus
Knowledge of networking, especially AWS services and infrastructure, is a plus
Must have a self-starter mentality and willingness to dig deep into the code.
Ability to own your work and adopt an outcomes-based approach in a fast-paced environment of quickly evolving requirements.
What we offer:
Competitive salary, 125k - 145k
Equity in the company, 0.25% - 0.5%
Full benefits (Medical, Dental, Vision, FSA, Commuter benefit)
401(k) and matching
Unlimited Paid Time Off
Flexible Work Environment
Senior Software Engineer
We're building the Data Platform of the Future
Join us if you want to rethink the way organizations interact with data. We are a developer-first company, committed to building around open protocols and delivering the best experience possible for data consumers and publishers.
Splitgraph is a seed-stage, venture-funded startup hiring its initial team. The two co-founders are looking to grow the team to five or six people. This is an opportunity to make a big impact on an agile team while working closely with the
Splitgraph is a remote-first organization. The founders are based in the UK, and the company is incorporated in both the USA and the UK. Candidates are welcome to apply from any geography. We want to work with the most talented, thoughtful and productive engineers in the world.
Data Engineers welcome! The job titles have "Software Engineer" in them, but at Splitgraph there's a lot of overlap between data and software engineering. We welcome candidates from all engineering backgrounds.
→ Apply to Job ← (same form for both positions)
What is Splitgraph?
## Open Source Toolkit
Our open-source product, sgr, is a tool for building, versioning and querying reproducible datasets. It's inspired by Docker and Git, so it feels familiar. And it's powered by PostgreSQL, so it works seamlessly with existing tools in the Postgres ecosystem. Use Splitgraph to package your data into self-contained data images that you can share with other Splitgraph instances.
## Splitgraph Cloud
Splitgraph Cloud is a platform for data cataloging, integration and governance. The user can upload data, connect live databases, or "push" versioned snapshots to it. We give them a unified SQL interface to query that data, a catalog to discover and share it, and tools to build/push/pull it.
Learn More About Us
Listen to our interview on the Software Engineering Daily podcast
Watch our co-founder Artjoms present Splitgraph at the Bay Area ClickHouse meetup
Explore the public data catalog where we index 40k+ datasets
How We Work: What does our stack look like?
We prioritize developer experience and productivity. We resent repetition and inefficiency, and we never hesitate to automate the things that cause us friction. Here's a sampling of the languages and tools we work with:
- Python for the backend. Our core open source tech is written in Python (with a bit of C to make it more interesting), as well as most of our backend code. The Python code powers everything from authentication routines to database migrations. We use the latest version and tools like pytest, mypy and Poetry to help us write quality software.
- TypeScript for the web stack. We use TypeScript throughout our web stack. On the frontend we use React with next.js. For data fetching we use apollo-client with fully-typed GraphQL queries auto-generated by graphql-codegen based on the schema that Postgraphile creates by introspecting the database.
- PostgreSQL for the database, because of course. Splitgraph is a company built around Postgres, so of course we are going to use it for our own database. In fact, we actually have three databases. We have:
  - auth-db for storing sensitive data,
  - registry-db, which acts as a Splitgraph peer so users can push Splitgraph images to it using sgr, and
  - cloud-db, where we store the schemata that Postgraphile uses to autogenerate the GraphQL server.
- PL/pgSQL and PL/Python for stored procedures. We define a lot of core business logic directly in the database as stored procedures, which are ultimately exposed by Postgraphile as GraphQL endpoints. We find this to be a surprisingly productive way of developing, as it eliminates the need for manually maintaining an API layer between data and code. It presents challenges for testing and maintainability, but we've built tools to help with database migrations and rollbacks, and an end-to-end testing framework that exercises the database routines.
- PostgREST for auto-generating a REST API for every repository. We use this excellent library (written in Haskell) to expose an OpenAPI-compatible REST API for every repository on Splitgraph.
- Lua (luajit 5.x), C, and embedded Python for scripting PgBouncer. Our main product, the "data delivery network", is a single SQL endpoint where users can query any data on Splitgraph. Really it's a layer of PgBouncer instances orchestrating temporary Postgres databases and proxying queries to them, where we load and cache the data necessary to respond to a query. We've added scripting capabilities to enable things like query rewriting, column masking, authentication, ACL, orchestration, firewalling, etc.
- Docker for packaging services. Our CI pipeline builds every commit into about a dozen different Docker images, one for each of our services. A production instance of Splitgraph can be running over 60 different containers (including replicas).
- Makefile and docker-compose for development. We use a highly optimized Makefile and docker-compose so that developers can easily spin up a stack that mimics production in every way, while keeping it easy to hot reload, run tests, or add new services or configuration.
- Nomad for deployment and Terraform for provisioning. We use Nomad to manage deployments and background tasks. Along with Terraform, we're able to spin up a Splitgraph cluster on AWS, GCP, Scaleway or Azure in just a few minutes.
- Grafana, Prometheus, ElasticSearch, and Kibana for monitoring and metrics. We believe it's important to self-host fundamental infrastructure like our monitoring stack. We use this to keep tabs on important metrics and the health of all Splitgraph deployments.
- Mattermost for company chat. We think it's absolutely bonkers to pay a company like Slack to hold your company communication hostage. That's why we self-host an instance of Mattermost for our internal chat. And of course, we can deploy it and update it with Terraform.
- Matomo for web analytics. We take privacy seriously, and we try to avoid including any third party scripts on our web pages (currently we include zero). We self-host our analytics because we don't want to share our user data with third parties.
- Metabase and Splitgraph for BI and dogfooding. We use Metabase as a frontend to a Splitgraph instance that connects to Postgres (our internal databases), MySQL (Matomo's database), and ElasticSearch (where we store logs and DDN analytics). We use this as a chance to dogfood our software and produce fancy charts.
- The occasional best-of-breed SaaS services for organization. As a privacy-conscious, independent-minded company, we try to avoid SaaS services as much as we can. But we still find ourselves unable to resist some of the better products out there. For organization we use tools like Zoom for video calls, Miro for brainstorming, Notion for documentation (you're on it!), Airtable for workflow management, PivotalTracker for ticketing, and GitLab for dev-ops and CI.
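To make the PostgREST item above more concrete, here is a minimal sketch of how a client might build a query URL against a PostgREST-style API. It relies only on PostgREST's general query conventions (a comma-separated select parameter, filters written as column=operator.value, and a limit parameter). The base URL, table name and column names are hypothetical placeholders, not real Splitgraph endpoints, and the helper build_postgrest_url is invented for illustration.

```python
from urllib.parse import urlencode


def build_postgrest_url(base_url, table, columns=None, filters=None, limit=None):
    """Build a PostgREST-style query URL for one table.

    filters maps a column name to an (operator, value) pair, e.g.
    {"year": ("gte", 2020)} becomes the query parameter year=gte.2020.
    """
    params = []
    if columns:
        # Columns are chosen with a comma-separated "select" parameter.
        params.append(("select", ",".join(columns)))
    for column, (operator, value) in (filters or {}).items():
        # Row filters take the form column=operator.value.
        params.append((column, f"{operator}.{value}"))
    if limit is not None:
        params.append(("limit", str(limit)))
    # Keep commas readable instead of percent-encoding them.
    query = urlencode(params, safe=",")
    url = f"{base_url.rstrip('/')}/{table}"
    return f"{url}?{query}" if query else url


# Hypothetical endpoint, table and columns, for illustration only.
example_url = build_postgrest_url(
    "https://example.invalid/api/v1",
    "measurements",
    columns=["station", "year"],
    filters={"year": ("gte", 2020)},
    limit=5,
)
print(example_url)
```

The resulting URL could be fetched with any HTTP client. The actual paths and authentication for a given repository would come from that repository's OpenAPI description rather than from this sketch.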
Life at Splitgraph
We are a young company building the initial team. As an early contributor, you'll have a chance to shape our initial mission, growth and company values.
We think that remote work is the future, and that's why we're building a remote-first organization. We chat on Mattermost and have video calls on Zoom. We brainstorm with Miro and organize with Notion.
We try not to take ourselves too seriously, but we are goal-oriented with an ambitious mission.
We believe that as a small company, we can out-compete incumbents by thinking from first principles about how organizations interact with data. We are very competitive.
Flexible working hours
Generous compensation and equity package
Opportunity to make high-impact contributions to an agile team
How to Apply? Questions?
If you have any questions or concerns, feel free to email us at email@example.com