Reliable, transparent lead delivery platform with multi-CRM integration –
principal engineer, team lead, architect, lead coder
Legacy system frequently failed, and failed silently, to deliver vital leads to customers—with 80% of company revenue in lead generation.
Leads were sometimes double-delivered or double-billed.
Cost-prohibitive to maintain or evolve: the team couldn’t add features customers demanded.
Daily batch delivery meant customers waited while their conversion rates fell.
Integration points included multiple CRM types, client CRM connections, internal Salesforce for billing, Cassandra, and external data services, any of which could fail at any time.
The team’s strong web engineers started out unsuited to this systems/data challenge, and the professional Client Services team was wrapped around the axle of the legacy tool.
Took project and team in hand, analyzing business and technical problems with internal Client Services staff.
Recognizing the core problem as one of reliability and transparency, not scale, opted for a Postgres DB on AWS RDS: eventual consistency was no solution here. Used the consistent DB to replace flaky daily batches with once-and-only-once streaming (“realtime”) delivery, reporting-to-billing, and a web dashboard, all running across multiple ephemeral node.js app/worker servers.
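A minimal sketch of how the consistent store enables once-and-only-once delivery across racing workers (table, column, and function names here are hypothetical, not the production schema): each worker atomically claims a lead with a single conditional UPDATE, so two ephemeral workers can never deliver the same lead twice.

```javascript
// Hypothetical sketch: atomic claim-then-deliver against a consistent DB.
// `db.query` is an injected async function (real system: pg on AWS RDS).

const CLAIM_SQL = `
  UPDATE leads SET status = 'delivering', claimed_by = $2
  WHERE id = $1 AND status = 'pending'`; // rowCount is 0 or 1

async function deliverOnce(db, crm, leadId, workerId) {
  const res = await db.query(CLAIM_SQL, [leadId, workerId]);
  if (res.rowCount === 0) return false;   // another worker already claimed it
  await crm.send(leadId);                 // deliver to the client CRM
  await db.query(`UPDATE leads SET status = 'delivered' WHERE id = $1`, [leadId]);
  return true;
}

// In-memory fake DB to illustrate the race without a live Postgres:
function fakeDb() {
  const leads = new Map([[1, { status: 'pending' }]]);
  return {
    query: async (sql, [id]) => {
      const lead = leads.get(id);
      if (sql === CLAIM_SQL) {
        if (lead.status !== 'pending') return { rowCount: 0 };
        lead.status = 'delivering';
        return { rowCount: 1 };
      }
      lead.status = 'delivered';
      return { rowCount: 1 };
    },
  };
}
```

With the fake DB, a second worker attempting the same lead gets `false` and never reaches the CRM call; on real Postgres the same guarantee comes from the row-level atomicity of the UPDATE.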
Used functional-dependency-injection in server for TDD and control of dependency/complexity.
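Functional dependency injection here means building each server component as a function of its collaborators rather than importing them directly, so tests substitute fakes with no mocking framework (handler and dependency names below are illustrative, not the production API):

```javascript
// Hypothetical sketch of functional dependency injection for TDD:
// makeLeadHandler closes over its dependencies, so production wiring
// passes real pg/Salesforce clients and tests pass plain fakes.

function makeLeadHandler({ saveLead, notifyBilling, log }) {
  return async function handleLead(lead) {
    if (!lead.email) {
      log('rejected lead without email');
      return { ok: false };
    }
    await saveLead(lead);
    await notifyBilling(lead);
    return { ok: true };
  };
}

// A test injects in-memory fakes and asserts on observable effects:
const saved = [];
const billed = [];
const handler = makeLeadHandler({
  saveLead: async l => saved.push(l),
  notifyBilling: async l => billed.push(l),
  log: () => {},
});
```

The same factory shape composes: higher-level components receive already-built handlers, keeping the dependency graph explicit and each unit testable in isolation.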
Used integration TDD for easy, repeatable testing against the full set of real services and CRM types.
AWS AutoScaling and ephemeral, stateless EC2 nodes (multi-AZ) behind ELB.
Architected the system and built a team culture to distinguish transient errors at integration points (to retry) from persistent, data-specific errors (to filter or alert on).
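The retry/filter split above can be sketched as a classifier plus a retry wrapper (the error codes and attempt count are assumptions for illustration; real classification covered each CRM and service type):

```javascript
// Hypothetical sketch: retry only transient integration failures;
// persistent, data-specific errors are surfaced immediately, never retried.

function isTransient(err) {
  // Assumed codes; the real list was built per integration point.
  return ['ETIMEDOUT', 'ECONNRESET', 'HTTP_503'].includes(err.code);
}

async function withRetry(op, { attempts = 3, onPermanent = () => {} } = {}) {
  for (let i = 1; ; i++) {
    try {
      return await op();
    } catch (err) {
      if (!isTransient(err)) {
        onPermanent(err);       // filter or alert on bad data
        throw err;
      }
      if (i >= attempts) throw err;  // transient but exhausted retries
    }
  }
}
```

The key design point is that the classification lives in one place, so the whole team shares a single definition of "retryable" across every integration.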
Realtime web dashboard and control with Angular.js, JSON over WebSockets, a Postgres trigger-on-delta feeding a publish/subscribe channel, Twitter Bootstrap/Less, semantic CSS, TDD, and gulp.
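The trigger-on-delta path can be sketched in two pieces (table, channel, and function names are hypothetical): a trigger that NOTIFYs a channel on each change, and a server-side fanout that forwards each delta to connected dashboard sockets. With the real `pg` client the server would LISTEN on the channel and call the fanout from its notification handler.

```javascript
// Hypothetical sketch of Postgres trigger-on-delta publish/subscribe.
// The trigger pushes each changed row as JSON onto a NOTIFY channel:
const TRIGGER_SQL = `
  CREATE OR REPLACE FUNCTION publish_delta() RETURNS trigger AS $$
  BEGIN
    PERFORM pg_notify('deliveries', row_to_json(NEW)::text);
    RETURN NEW;
  END $$ LANGUAGE plpgsql;
  CREATE TRIGGER deliveries_delta AFTER INSERT OR UPDATE ON deliveries
    FOR EACH ROW EXECUTE PROCEDURE publish_delta();`;

// The node server fans each NOTIFY payload out to dashboard WebSockets:
function fanOut(clients, payload) {
  const delta = JSON.parse(payload);
  for (const ws of clients) ws.send(JSON.stringify({ type: 'delta', delta }));
  return delta;
}
```

Because the DB itself publishes deltas, every ephemeral app server sees the same stream with no extra message broker.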
Guided and mentored team to competence in this unfamiliar systems world, including integration know-how and taking total ownership of devops.
Thrilled our customers with realtime delivery and rock-solid reliability, with immediate sales response (realtime increases customer ROI).
Saved Client Services staff time and headaches.
Project team continues to confidently innovate—and no longer wastes time fixing legacy system.
Machine learning-driven recommender system for email delivery –
initiator and team lead
Change.org’s manual email targeting tools and processes were delivering significantly sub-optimal results and thwarting the scaling of the user base, content base (petitions), professional campaign staff, and international expansion.
These manually targeted mass emails drove 80% of traffic and thus revenue.
The company’s early tools had gotten it started well, but it was rapidly outgrowing them: “what gets us here won’t get us there.”
The dedicated campaign staff needed something radically better.
I started the change.org “Data/science” team with this project as first mission.
We empirically discovered which data sets, data processes, technologies, and machine learning algorithms garnered best results—and productized the winners into a system used daily by US and international staff to reach an email audience of 30+ million users.
Tech: AWS distributed and ephemeral compute solution including MapReduce with Cascading on AWS EMR; on-demand machine learning (for laboratory and production) on scaled-to-job-size EC2 clusters coordinated with AWS Simple Workflow (SWF); dynamic results visualization (confidence curves) and a reach/impact tradeoff control web tool using D3.
30% immediate conversion rate improvement; dramatic campaign staff efficiency and focus improvements;
staff in total control over reach/impact tradeoffs.
Move analytics to Amazon Redshift with ETL –
initiator and team lead
Emergent performance collapse for heavy analytics queries on a legacy disk-based MySQL replica of the huge production DB.
Recognizing the company’s increasing dependence on a vital and limited resource, I worked with management to identify this key opportunity to avoid breakdown.
Selected Amazon Redshift to provide us affordable/easy/proven scalability, true SQL for broadest use, and advanced analytics features like window functions.
Oversaw development and architected Extract-Transform-Load (ETL) pipeline from production MySQL.
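One core ETL step can be sketched as follows (table, bucket, and option choices are illustrative, not the production pipeline): rows extracted from MySQL are transformed into delimiter-escaped text, staged on S3, and bulk-loaded with a single Redshift COPY per file.

```javascript
// Hypothetical ETL sketch: transform a MySQL row into a line safe for
// Redshift COPY with a pipe delimiter and an explicit NULL marker.

function toRedshiftLine(row, columns) {
  return columns
    .map(c => {
      const v = row[c];
      if (v === null || v === undefined) return '\\N';  // NULL marker
      return String(v)
        .replace(/\\/g, '\\\\')   // escape backslashes first
        .replace(/\|/g, '\\|')    // then the delimiter
        .replace(/\n/g, ' ');     // flatten embedded newlines
    })
    .join('|');
}

// The load side is a single bulk COPY per staged file (credentials and
// manifest handling omitted; paths are placeholders):
const COPY_SQL = `
  COPY signatures FROM 's3://etl-bucket/signatures/batch-0001.gz'
  CREDENTIALS '...' GZIP DELIMITER '|' NULL AS '\\N';`;
```

Bulk COPY from S3, rather than row-at-a-time INSERTs, is what lets the pipeline keep pace as the dataset grows.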
Led the team to embrace fast-cycle empiricism, owning devops in the cloud, making tools that Just Work, and test-driven data-processing development.
Solution has scaled happily with both dataset and query load;
allowed development of increasingly sophisticated and diverse tools and analytics visualizations;
freed staffers and managers across the company to meet their own needs using direct SQL queries without taxing precious advanced analyst resources.
High-throughput fraud analysis tool –
primary engineer, inception to full production
A rise in signature fraud was swamping engineering’s fraud-review process and tools, and required skilled engineers to perform this chore, at a large and growing cost to engineering productivity and morale.
Uncaught fraud at change.org can explode in the press as brand damage.
Created a fraud analysis tool using Redshift, Redis, node.js, Angular.js, and HTML5.
Recognized unique challenge of human-driven fraud analysis: visual pattern matching over many fields for thousands of items/hour, applying a series of heuristics.
Used automatic precomputation and Redis caching of heavy “first pass” Redshift query results, together with zero-wait UX design, to effect the tightest possible scan-decide-act cycle for users.
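A minimal sketch of that precompute-then-cache split (function and key names are hypothetical): a scheduled job runs the heavy Redshift query and parks results in Redis, and the UI reads only the cache, so the analyst never waits on the warehouse.

```javascript
// Hypothetical sketch: heavy "first pass" Redshift results are computed
// on a schedule and cached; the zero-wait UI path touches only Redis.

async function precompute(redshiftQuery, cache, key, ttlSeconds) {
  const rows = await redshiftQuery();            // minutes-long scan query
  await cache.set(key, JSON.stringify(rows), ttlSeconds);
}

async function firstPass(cache, key) {
  const hit = await cache.get(key);              // zero-wait path for the UI
  return hit ? JSON.parse(hit) : [];             // never fall through to Redshift
}
```

Decoupling the expensive query from the interactive path is what makes the scan-decide-act loop feel instantaneous regardless of warehouse load.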
Designed a high-information-density UI for visual scanning (cf. Edward Tufte) and a controlled workflow.
That puts users in the flow state, the only way to get both efficient user performance and sufficient filtering quality.
Allows non-engineers to perform this work;
10-20 engineer hours/week freed immediately (to much rejoicing), growing 2-4x over following year;
allows international staff to deal with non-US fraud.