Account Takeover: Fraud Series #1
Pattern Analysis; Bonus- SQL: DATEDIFF, TIME ZONE, LAG & more.
This one is for analysts preparing for interviews at the various fraud detection and prevention companies and roles, in which analysts are often examined concerning their approach to investigating a case study via data.
Key points:
Approach the data analytically. I recommend starting vertically (per variable/column), then horizontally (per row), and then looking at the entire story.
Be verbal- Clarifications & assumptions. Explain your methodology, thoughts.
It’s a good opportunity to show creativity and critical thinking skills.The data itself is neither good nor bad- Compose good and bad story/stories per set of data. Then decide on your final answer if the account has been taken over or not, degree of confidence, et cetera (as the format may change per interviewer).
Think outside the box- Consider what would you say in case you’d be asked, for example, “what other data/variables would’ve helped you assess the situation?”,
or “what manipulations/calculations to the existing data/variables would assist you in making a better assessment?”
Side note: For an introduction to the world of fraud detection and prevention, I highly recommend the book of Gilit Saporta and Shoshana Maraney, “Practical Fraud Prevention: Fraud and AML Analytics for FinTech and eCommerce Using SQL and Python” (O’Reilly, 2022). The book also has a chapter dealing with the subject of Account Takeovers.
Note that you can create for yourself additional practice sets precisely like I did here:
(1) I got data samples, (2) structured some scenarios to teach ChatGPT, and then (3) instructed it to generate new cases for me to practice. Make sure you instruct it to get creative in building a stories behind each case to facilitate better results.
If done face to face, or given space to leave text, you may ask for clarifications / note your assumptions. In this case, let’s say you may assume:
All records refer to bank/financial institutions accounts.
All of the records start from the first login of the users to their accounts.
Each case is for one user, one account, and no account has more than one user.
LoginDateTime was localized (see Bonus part); format: MM/DD/YYYY HH:MM.
Case Study #1: Tina
Case Study #2: Alice
Case Study #3: Osher
My approach and some thoughts:
In general, I recommend - particularly when given to do it on Excel, paper, or white board - to start vertically, then horizontally, then entire case.
Vertically: Understand the variables & mark any unique part in the data itself
(e.g. the tiny change in user agent because later it’s easier to think, for instance, if a tiny change in user agent is possibly an upgrade of same device or different device, etc.).Horizontally, row by row, begin seeing how the combination of the different variables connect with one another and what you can make of it. That’s a good time to think of manipulations/calculations with minimal biases and heuristics.
Evaluate the story as a whole to reach the final decision.
Case Study #1: Tina -
Bottom-line: Good story triumphed over bad story.
I imagined Tina traveled with her laptop (user agent showed it was the same device throughout the records), and the login pattern seemed consistent in terms of 1 login per day at around same time frame, except for when she enjoyed her vacation.
Case Study #2: Alice -
Bottom-line: Bad story triumphed over good story. Row 8 was when she had her details exposed (her device was probably hacked too), suspecting row 13 was the fraudster logged into her account to sniff around.
I imagined Alice owning two devices, iPhone and iPad, which she routinely used from the same place (always connection from the same place).
Her routine: daily account-check, aside from when she wanted to install the official application on a new device. She then seemed to first, log into her account via web, and only then used the direct link to get to download, install, and log into the bank’s official application that she used thereon.
The 7 days of the week was also an indicator worth keeping in mind when I realized I was looking at consecutive days. In Alice’s case, the logins’ hour/time of the day had a pattern relevant to whether it was a work day or weekend. This was much easier to see and verify using SQL (though if such a test is on a white board or paper - just say it!).
SELECT CASE WHEN DATENAME(DW,LoginDateTime) IN ('Saturday', 'Sunday')
THEN 'weekend'
ELSE 'workday'
END AS isWeekend
, DATENAME(DW,LoginDateTime) AS [DayName]
, FORMAT(LoginDateTime, 'HH:mm') AS [HourMinute]
, *
FROM ATO
WHERE userID = 'N26'
Note the “LastSiteVisited” on row 8 provided a strong indication that the fraudster used an overlay as the modus operandi. This cue helped interpret the rest of the data too.
Moreover, I suspected row 13 was the fraudster logged into the account because it imitated Alice’s activity and data. But it didn’t fit in her routine: not using application once installed on device; 2nd login that day; hour would fit during the week (when account was compromised), not weekend.
Furthermore, while I believe faking device data is possible, I don’t think it’s possible to fake IP. To the best of my knowledge, it’d require access to the network. That was why I thought her device/network were hacked and not just the credentials stolen.
Case Study #3: Osher -
Bottom-line: Good story triumphed over bad story. Always lean towards the good story unless the bad story overwhelmingly beats the good/plausible story. In short, no clear indication of ATO or strong suspicion of one.
Osher checked the account once per 3 days from a number of different devices (1 phone, 1 tablet, 3 computers). 3 computers made me imagine some high tech employee with his own personal computer, work computer and possibly old PC or spouse’s or so computer. Hybrid work. It may also mean one of the different locations and IPs result from logging in from the company office (likely earlier logins from his own house due to the variety of devices).
With the exception of two times, he usually logged in around lunch break. Yet I see no reason to suspect the deviation from this pattern when it was around the same location, pattern of 1 check per 3 days remained, same way to login, and device was not new.
Last login could have many good explanations as well (2nd office, change of apartment, change of ISP of same apartment, etc.).
SELECT userID
, FORMAT(LoginDateTime, 'yyyy-MM-dd HH:mm') AS LoginDateTime
, CASE WHEN DATENAME(DW,LoginDateTime) IN ('Friday', 'Saturday')
THEN 'weekend'
ELSE 'workday'
END AS isWeekend
, DATENAME(DW,LoginDateTime) AS [DayName]
, FORMAT(LoginDateTime, 'HH:mm') AS [HourMinute]
, COALESCE(
DATEDIFF(DAY,
LAG(LoginDateTime) OVER (ORDER BY LoginDateTime ASC)
, LoginDateTime)
,0) AS DaysBetweenLogins
, COALESCE(
DATEDIFF(HOUR,
LAG(LoginDateTime) OVER (ORDER BY LoginDateTime)
, LoginDateTime)
,0) AS HoursBetweenLogins
, Device
, DeviceGroup
, Browser
, OS
, UserAgent
, [IP]
, ISP
, CountryLevel
, StateRegionLevel
, CityLevel
, LastSiteVisited
FROM ATO
WHERE userID = 'N31'
Thinking outside of the box: What additional data could be useful here? Perhaps…
VPN used- Boolean
Name of network connected from
Amount of time logged into the account (and calculate average)
Pages viewed inside account / number of pages viewed
Any ‘biometric-behavior’ data of the user
Bonus
Built in time zone converters are smarter and easier to use, and it is worth to know such options exist, particularly when trying to immerse yourself in the case, finding the anomaly and indicators someone other than the rightful owner accessed the account.
AT TIME ZONE is a feature in T-SQL that allows for the conversion of datetime values between different time zones. It automatically accounts for Daylight Saving Time adjustments based on the specified time zone and the date of the datetime value.
Use SELECT * FROM sys.time_zone_info to know the exact time zone names SQL Server uses.
SELECT userID
, CountryLevel
, StateRegionLevel
, CASE
-- Convert from Israel time zone to UTC
WHEN CountryLevel = 'Israel' THEN
CONVERT(DATETIME, LoginDateTimeLocalized AT TIME ZONE 'Israel Standard Time' AT TIME ZONE 'UTC')
-- Convert from USA - California time zone to UTC
WHEN CountryLevel = 'USA' AND StateRegionLevel = 'California' THEN
CONVERT(DATETIME, LoginDateTimeLocalized AT TIME ZONE 'Pacific Standard Time' AT TIME ZONE 'UTC')
-- Convert from USA - Hawaii time zone to UTC
WHEN CountryLevel = 'USA' AND StateRegionLevel = 'Hawaii' THEN
CONVERT(DATETIME, LoginDateTimeLocalized AT TIME ZONE 'Hawaiian Standard Time' AT TIME ZONE 'UTC')
-- Convert from USA - Hawaii time zone to UTC
WHEN CountryLevel = 'USA' AND StateRegionLevel = 'Maryland' THEN
CONVERT(DATETIME, LoginDateTimeLocalized AT TIME ZONE 'Eastern Standard Time' AT TIME ZONE 'UTC')
-- Add more conditions for other states/regions as needed
ELSE null -- Default behavior (quick indicator of missing combination)
END AS LoginDateTimeUTC
, LoginDateTimeLocalized
FROM step1
Create ATO table & case studies data
If you want to run the SQL code on your own.
CREATE TABLE ATO (
userID NVARCHAR(50) NOT NULL,
LoginDateTime DATETIME NOT NULL,
Device NVARCHAR(100),
DeviceGroup NVARCHAR(100),
Browser NVARCHAR(100),
OS NVARCHAR(100),
UserAgent NVARCHAR(MAX),
IP NVARCHAR(50),
ISP NVARCHAR(100),
CountryLevel NVARCHAR(100),
StateRegionLevel NVARCHAR(100),
CityLevel NVARCHAR(100),
LastSiteVisited NVARCHAR(255)
);
-- Alice data
INSERT INTO ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/01/2021 18:30:00 PM', 'Apple iPhone', 'Smart Phone', 'Safari', 'iOS', 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', 'https://www.google.com');
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/01/2021 18:45:51 PM', 'Apple iPhone', 'Smart Phone', NULL, 'iOS', 'AppleApp_bankOfAmerica/23.10 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X)', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', NULL);
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/02/2021 14:01:11 PM', 'Apple iPhone', 'Smart Phone', NULL, 'iOS', 'AppleApp_bankOfAmerica/23.10 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X)', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', NULL);
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/03/2021 14:05:22 PM', 'Apple iPhone', 'Smart Phone', NULL, 'iOS', 'AppleApp_bankOfAmerica/23.10 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X)', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', NULL);
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/04/2021 18:12:34 PM', 'Apple iPhone', 'Smart Phone', NULL, 'iOS', 'AppleApp_bankOfAmerica/23.10 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X)', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', NULL);
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/05/2021 18:07:18 PM', 'Apple iPhone', 'Smart Phone', NULL, 'iOS', 'AppleApp_bankOfAmerica/23.10 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X)', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', NULL);
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/06/2021 18:40:18 PM', 'Apple iPad', 'Tablet and iPad', 'Chrome', 'iOS', 'Mozilla/5.0 (iPad; CPU OS 12_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/76.0.3809.81 Mobile/15E148 Safari/605.1', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', 'https://www.bankofameri.ca/redirecter');
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/06/2021 18:43:37 PM', 'Apple iPad', 'Tablet and iPad', NULL, 'iOS', 'AppleApp_bankOfAmerica/23.10 (iPad; CPU OS 12_3 like Mac OS X)', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', NULL);
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/07/2021 18:06:06 PM', 'Apple iPad', 'Tablet and iPad', NULL, 'iOS', 'AppleApp_bankOfAmerica/23.10 (iPad; CPU OS 12_3 like Mac OS X)', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', NULL);
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/08/2021 18:20:16 PM', 'Apple iPad', 'Tablet and iPad', NULL, 'iOS', 'AppleApp_bankOfAmerica/23.10 (iPad; CPU OS 12_3 like Mac OS X)', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', NULL);
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/09/2021 14:14:14 PM', 'Apple iPad', 'Tablet and iPad', NULL, 'iOS', 'AppleApp_bankOfAmerica/23.10 (iPad; CPU OS 12_3 like Mac OS X)', '203.0.113.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', NULL);
INSERT INTO alice_ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES ('N26', '10/09/2021 19:31:01 PM', 'Apple iPad', 'Tablet and iPad', 'Chrome', 'iOS', 'Mozilla/5.0 (iPad; CPU OS 12_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/76.0.3809.81 Mobile/15E148 Safari/605.1', '203.0.118.45', 'Verizon', 'USA', 'Maryland', 'Baltimore', 'https://www.google.com');
--- Osher data
INSERT INTO ATO (userID, LoginDateTime, Device, DeviceGroup, Browser, OS, UserAgent, IP, ISP, CountryLevel, StateRegionLevel, CityLevel, LastSiteVisited) VALUES
('N31', '9/2/2023 13:27', 'iPhone', 'iPhone', 'Safari', 'iOS', 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1', '212.143.19.183', 'Bezeq', 'Israel', 'Tel Aviv', 'Tel Aviv', 'https://www.google.com'),
('N31', '9/5/2023 14:24', 'Desktop', 'Desktop', 'Safari', 'Mac', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.4 Safari/605.1.15', '212.143.19.183', 'Bezeq', 'Israel', 'Tel Aviv', 'Tel Aviv', 'https://www.google.com'),
('N31', '9/8/2023 13:16', 'Desktop', 'Desktop', 'Chrome', 'Windows', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36', '212.143.19.183', 'Bezeq', 'Israel', 'Tel Aviv', 'Tel Aviv', 'https://www.google.com'),
('N31', '9/11/2023 14:30', 'Samsung Galaxy Tab S4', 'Tablet', 'Other', 'Android', 'Mozilla/5.0 (Linux; Android 9; SM-T830 Build/PPR1.180610.011; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/76.0.3809.89 Safari/537.36', '212.143.19.184', 'Bezeq', 'Israel', 'Tel Aviv', 'Tel Aviv', 'https://www.google.com'),
('N31', '9/14/2023 14:35', 'Desktop', 'Desktop', 'Safari', 'Mac', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.4 Safari/605.1.15', '212.143.1.100', 'Bezeq', 'Israel', 'Tel Aviv', 'Tel Aviv', 'https://www.google.com'),
('N31', '9/17/2023 22:35', 'iPhone', 'iPhone', 'Safari', 'iOS', 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1', '212.143.19.184', 'Bezeq', 'Israel', 'Tel Aviv', 'Tel Aviv', 'https://www.google.com'),
('N31', '9/20/2023 19:20', 'iPhone', 'iPhone', 'Safari', 'iOS', 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1', '909.303.1.505', 'Cellcom', 'Israel', 'Tel Aviv', 'Ramat Gan', 'https://www.google.com'),
('N31', '9/23/2023 13:16', 'Desktop', 'Desktop', 'Safari', 'Mac', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.4 Safari/605.1.15', '212.143.19.184', 'Bezeq', 'Israel', 'Tel Aviv', 'Tel Aviv', 'https://www.google.com'),
('N31', '9/26/2023 14:14', 'Desktop', 'Desktop', 'Safari', 'Mac', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.4 Safari/605.1.15', '909.303.17.777', 'Cellcom', 'Israel', 'Tel Aviv', 'Tel Aviv', 'https://www.google.com');