How much user activity logging should my SaaS do?

SteveMcLeod · February 22, 2017, 6:19pm

For those of you actively running a SaaS, can I pick your brains?

I’m currently working on the user sign up/sign in/recover password functionality for Feature Upvote. I’m in danger of over-engineering it. I log every signin failure and success in the application log. I’m recording in the database User table the date/time of the last successful signin and last failed signin, and last modified date.

Should I be recording all sign ins (or recent sign ins) in the database?
Should I be recording IP address and user agent of each sign in?

My product is not dealing with financial data or anything else that is critical. I want to do things well, but I also don’t want to go overboard.

My MySQL user table currently looks like this:

CREATE TABLE user
(
user_id INT(11) PRIMARY KEY NOT NULL AUTO_INCREMENT,
email VARCHAR(200) NOT NULL,
salted_hash VARCHAR(100) NOT NULL,
display_name VARCHAR(200),
date_created TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
date_modified TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
last_signin_success TIMESTAMP,
last_signin_failure TIMESTAMP
);
CREATE UNIQUE INDEX user_email_uindex ON user (email);

Any tips from someone who has been down this path already? Anything you really wished you had baked into the user system from the beginning?

maximus · February 22, 2017, 7:00pm

Don’t add last_signin_* fields to the user table
Create audit_trail table instead (example below):

CREATE TABLE audit_trail
(
audit_trail_id serial NOT NULL,
app_name varchar(256),
method_name varchar(256) NOT NULL,
tenant_id int4,
user_name varchar(64) NOT NULL,
start_date timestamp,
end_date timestamp,
params text,
result bool,
host_ip varchar(64),
client_ip varchar(64),
exception_info varchar(2000),
exception_detail text,
partition_date timestamp NOT NULL,
CONSTRAINT audit_trail_pkey PRIMARY KEY (audit_trail_id)
);

Make sure you have all required indexes.

Record all important events, such as sign ins, sign ups, sign in attempts, etc. Simply insert into autit_trail
If needed - build autit_trail UI within your app to query events

Example: https://etlworks.com/docs/#statistics

unboot · February 22, 2017, 9:15pm

Hi,
not directly related to tracking, but maybe add a column
canonical_email VARCHAR(200) NOT NULL,
and make this the downcase of email (and check for uniqueness)

Otherwise these two “users” can subscribe:

Steve@example.com
and
steve@example.com

This can lead to funny situations …

rfctr · February 22, 2017, 9:19pm

+1

Audit table is both a cleaner separation of concerns and more flexible.

kohanz · February 22, 2017, 9:53pm

Why not just downcase the original e-mail field when you persist it. Is there a reason to preserve case for e-mails?

unboot · February 22, 2017, 10:17pm

You can do that, obviously.
But some users/companies prefer emails like: S.Smith@example.com.
You can then show the original (upcase) email to the user after loging in , for example in the top right corner of your admin panel. It’s aesthetics / UX mainly.

rfctr · February 23, 2017, 2:06am

Yes, case-aware but case-insensitive.

For audit purposes, it is also important to save the data exactly as they were entered by the user - in case this minor difference was sufficient to trigger a defect.

Rhino · February 23, 2017, 9:53am

Whats your defense against brute force attacks on username/password (either random guesses against a single login or random guesses against a bunch of passwords)?

What language & framework are you using? Is there a commonly agreed ‘best way’ to do this in that particular framework?

SteveMcLeod · February 23, 2017, 9:54am

Thanks all. This is v helpful.

So I’ll record all user sign ins in an audit trail table. And I’ll add a canonical email field to my user table.

DLJ · February 23, 2017, 3:49pm

Defo use an audit table/audit db.

keep ‘working’ tables to the bare minimum you can for the app to work.

I tend to use auth tables with just login id and hash to check against, everything else inc display name gets put in to a profile table. usually also have a prefs table too.

if you are using a framework, look in to throttling/blocking login attempts too as Rhino suggested

Clay_Nichols · February 23, 2017, 4:07pm

I vastly overengineered our logging (I use Mixpanel for this).

Now the approach I take is just enough logging to spot if there is a general issue in an area.
So if I want to see if users are logging in OK, I might just log the failures (or enough info so that I know the failure %).

So… I cast a very wide net with very big holes, looking only for the biggest fish to fry.

(How’s that for a mixed metaphor??)

FYI, your site did not come up
http://featureupvote.com/

rfctr · February 23, 2017, 4:56pm

Standard practice is to introduce an artificial delay into login. Makes brute-force impractical.

dgrigg · February 23, 2017, 4:57pm

Why not save the email as entered but use case insensitive sql for queries. Two versions is redundant. Separate table for audit is the right way to go. Depending on your needs you can even remove old entries to keep it small if needed.

SteveMcLeod · February 23, 2017, 5:03pm

Thanks for the alert. We’re currently setting up a proper production system. It seems that we’re briefly responding to https only…and even then you’ll get an AWS “Welcome” page… Should be better tomorrow.

SteveMcLeod · February 23, 2017, 5:07pm

Hmmm, I neglected to implement a protection against brute force attacks. I guess I should add per IP address throttling.

I’m using Java + “Java Spark”, which is a very immature web framework, but oh so enjoyably light compared to most Java frameworks. It looks like I’ll have to add sign in throttling myself.

Rhino · February 23, 2017, 5:25pm

I ask as in addition to ip address throttling and random delay an app I’ve been involved with before kept a count of # failed logins against a username (your [user] table) - when > X could block temporarily or big delay, when > Y then could block for manual review. Reset on successful login.

(Just doing IP throttling won’t pick up a distributed attack focused against an individual - guess that would be very unlikely for your app but for the work involved for # failed logins might be worth it anyway).

rfctr · February 23, 2017, 5:43pm

… which unfortunately creates a door for DoS attack against a particular user - just fail their login repeatedly and they won’t be able to login.

Salman · February 23, 2017, 6:00pm

It really depends on what action items you want with this type of logging.

If you want to see usage patterns, you might use something like mixpanel or statsd (https://github.com/etsy/statsd).

If you are looking to do rate limiting then there is a different approach.

I was once using this app and I fiddled with the URL i.e. altered the ID in the querystring, and my account was immediately disabled. Now that was hard core

Rhino · February 23, 2017, 6:01pm

Good point - depends on the trade offs per application. (e.g. pretty sure most banks do this as always have phone/branch banking as backup).

I guess things like this is why looking for drop in components that implement all the best practices is maybe futile.

rfctr · February 23, 2017, 6:48pm

I believe blocking the user after failed logins is suitable for internal networks, where the security team then can proceed and find an app or person who does that - and eliminate it (cause, not person; tho who knows, these are the hard times).

In internet the find and eliminate part is simply not working, so the blocking backfires.

For internet I’d rather temporarily (for 5 minutes) block the IP from which we get a few failed attempts. That would make most script attacks on a user much harder to implement, while at the same time would be largely invisible to real users.