
Data sovereignty - how do you deal with customers in multiple jurisdictions?


#21

I’d really like to bring the thread back on track - my original hope was that SaaS business owners handling cross-border personal information would share tips or experience on how they’re doing it, from both legal and technical standpoints. I believe we’ve covered enough justification of why this is worth discussing.

From here on, if you’d like to debate whether or not you believe data sovereignty is a pertinent issue, or if the issue even exists, I’d kindly ask you to start your own thread.

Personally, I’m dealing with EU, China, and no-sovereignty-law zones. I’m running everything out of EU datacenters at the moment, and now that China has passed its own data sovereignty laws, my clients are scrambling to audit their data warehousing and any subcontractors that process personal info. At this stage, I’m looking at writing a database router for my application, adding a few columns to client profiles so they can set data locality preferences, and saving and retrieving the data on a Chinese VPS. This is aggravating, because:

  1. That’s a lot of work, and it’s unclear if I’ll be able to bill it
  2. Chinese VPSes cost a mint compared to other options
  3. The db server would be exposed to the internet (yes, I’d close everything except the db port, and open that only to the app server IP, but it’s still less secure than a private network).
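
A minimal sketch of the router idea described above, assuming a single `data_locality` column on the client profile. The region codes, hostnames, and the names `ClientProfile`, `REGION_DSNS`, and `dsn_for` are all hypothetical; a real application would hand the returned DSN to its own connection pool.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical region -> connection string map; hostnames are placeholders.
REGION_DSNS = {
    "eu": "postgresql://app@db.eu.example.internal/app",
    "cn": "postgresql://app@db.cn.example.internal/app",
}
DEFAULT_REGION = "eu"  # assumption: clients with no preference stay in the EU


@dataclass
class ClientProfile:
    client_id: int
    data_locality: Optional[str] = None  # the new column on the profile


def dsn_for(client: ClientProfile) -> str:
    """Return the connection string for the region this client's data lives in."""
    region = client.data_locality or DEFAULT_REGION
    try:
        return REGION_DSNS[region]
    except KeyError:
        # Fail closed: never fall back silently to another jurisdiction.
        raise ValueError(f"no database configured for region {region!r}")
```

Raising on an unknown region, rather than falling back to the default, is a deliberate choice here: a silent fallback would store data in the wrong jurisdiction without anyone noticing.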

#22

So if an account is marked as Chinese, you essentially use a separate connection pool for it, one that connects to the Chinese VPS (over a VPN, I assume)?

It feels like a very good solution to me.

It should satisfy any law. It shouldn’t take much work to implement. Exposing the server to the internet is not such a big deal if you only need to open one port, and only to known source hosts. (How much does a Chinese VPS cost, though?)

The main issue with such multi-sovereignty SQL databases is that you cannot run a query against a table and get a result for all users; you’d have to send a request to each of the databases and then combine the results. That alone may be a reason to use one of the NoSQL databases that allow multi-instance queries, but that only works for new projects.
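
To illustrate the "query each database and combine" step, here is a rough sketch. It uses in-memory SQLite connections from the Python standard library purely as stand-ins for the per-region servers; the `users` table, the sample rows, and the helper names are made up for the example.

```python
import sqlite3


def make_region_db(rows):
    """Create an in-memory database standing in for one region's server."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", rows)
    return conn


def query_all_regions(conns, sql, params=()):
    """Run the same query in every region and tag each row with its region."""
    merged = []
    for region, conn in conns.items():
        for row in conn.execute(sql, params):
            merged.append((region, *row))
    return merged


conns = {
    "eu": make_region_db([(1, "Anna")]),
    "cn": make_region_db([(1, "Li")]),
}
rows = query_all_regions(conns, "SELECT id, name FROM users ORDER BY id")
```

Note that the two regions can hand out the same primary key, which is why each row is tagged with its region. Any global `ORDER BY`, `LIMIT`, or aggregate also has to be re-applied in application code after merging.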


#23

As you point out, it breaks the relationships stored in the DB. Those relationships are kinda important, because some Chinese and EU clients are subsidiaries of the same parent, and the parent is going to want a global view of things. It also stinks of huge technical debt. (I’m using Postgres).

A fairly small Chinese VPS cost around $300/month when I last looked a couple of years ago. Hopefully prices have come down by now; obviously at that price I didn’t look too hard. I don’t know whether the datacenters are generally ISO 27001 certified either.

Maybe it’ll work out that a cron job to extract and scp the data to somewhere in China each day will be acceptable, but that’s not a good solution for when [insert country here] decides to legislate some stricter requirement.


#24

Maybe an idea: use sharding within Postgres to distribute your data. Not sure how feasible that is.


#25

Citus makes a sharded Postgres db, but it’s designed for situations where you need sharding for performance, rather than data isolation. I don’t want to move to their solution. Instagram rolled their own sharded Postgres db using time/user-based primary keys with a little PL/pgSQL, but I believe their shards were located on the same server when they did this? I admit I don’t know how this would work across multiple schemas/servers/datacenters - I’ve only recently started looking at that sort of complexity.
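
For reference, a sketch of a time/shard-based primary key in that general style: a millisecond timestamp in the high bits, then a logical shard id, then a per-shard sequence. The 13/10 bit split follows Instagram's published description as far as I recall it; the epoch value and the function names are assumptions for illustration, and the original was written in PL/pgSQL rather than Python.

```python
import time

EPOCH_MS = 1_600_000_000_000  # hypothetical custom epoch, in milliseconds
SHARD_BITS = 13               # room for 8192 logical shards
SEQ_BITS = 10                 # 1024 ids per shard per millisecond


def make_id(shard_id, seq, now_ms=None):
    """Pack timestamp, logical shard id, and sequence into one integer id."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id out of range")
    ts = now_ms - EPOCH_MS
    return (ts << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | (seq % (1 << SEQ_BITS))


def shard_of(id_):
    """Recover the logical shard id from an id produced by make_id."""
    return (id_ >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
```

If logical shards were assigned per jurisdiction (an assumption, not something the Instagram scheme does), `shard_of` would tell the application which region's database holds a row from its id alone, which might help with cross-region references between subsidiaries of the same parent.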


#26

With regard to Russian laws, as was already stated - the laws are written in an obscure way and can be interpreted however the government machine wishes.

Specifically, the law about “personal information” says that “data should be processed using servers located in Russia”. It does not say that the data has to be processed using servers only in Russia. So in order to comply you can, say, mirror data on the Russian server and that would be it. You use Russian servers to process it. Theoretically.

For Russia I’d say: ignore it as long as you can (for small businesses not involved with politics = forever). When problems arise and your site is about to be blocked, start doing something, provided you have enough Russian users to bother. Keep in mind that any Russian who purchases foreign services knows ways to visit “blocked” web sites :) Yes, good old “the severity of Russian laws is always compensated by the optionality of their execution”.