If you have a multi-tenant dashboard in an OAuth-compatible BI platform and you want to restrict access to table rows based on a specific user or group, the row-level security feature in BigQuery can help you get there.
The old-school way to restrict access to specific rows involved the adding of extra columns to the table to identify who has access and JOIN it against an auxiliary table with those users/groups. For example:
SELECT t.* FROM `project.dataset.table` AS t JOIN `project.dataset.users` AS u ON t.access=u.name;
All very straightforward right? But maintaining that user’s table requires an extra effort when adding/removing them. This can be avoided by using the row-level feature available in BigQuery. You can now grant access directly based on Google groups and unify the management process. For example:
CREATE OR REPLACE ROW ACCESS POLICY policy_<customer_id> ON `project.dataset.table` GRANT TO ('group:<customer_id>@<domain>') FILTER USING (customer_id = '<customer_id>');
Also, instead of using common credentials set based on a Single-Service Account, you may use the user’s authorization via OAuth. This allows you to provide more granular-level access (security and auditing), quota control and usage information to drive the product based on this data.
I hope you are diligent and making decisions based on hard facts.
If you have more than 100 customers during the process of scaling up, this solution is not viable since there is a hard limit in BigQuery (trust me on this and don’t waste your time on asking Google to raise the threshold). Also creating 100+ rules for a single table sounds a bit off, right?
Also, what if you want to have an admin group with users that actually need access to all of the data? This use case can be tricky as well.
How to Tackle This Challenge?
The following approach filters the access based on the user’s domain name. The advantage is that it allows you to avoid creating new groups and maintaining the membership. But it also ensures that your internal group has unlimited access to the data — all just with two row-level policies.
The clear disadvantage of this methodology is that your clients need to use a company domain name to make it work effectively. This solution is not suitable for generic domains like gmail.com, yahoo.com and hotmail.com as you will be granting access to millions of users.
-- Create a table CREATE TABLE test.my_table ( id INT, name STRING, domain STRING); -- Insert some values INSERT INTO test.my_table (id, name, domain) VALUES (1, 'test1', 'example.com'), (2, 'test2', 'domain.com'); -- Restrict access to the users based on their domain name CREATE OR REPLACE ROW ACCESS POLICY restrict_per_domain ON `test.my_table` GRANT TO ('allAuthenticatedUsers') FILTER USING (domain = SUBSTR(SESSION_USER(),STRPOS(SESSION_USER(),'@')+1)); -- Grant access to an internal DoiT group CREATE OR REPLACE ROW ACCESS POLICY grant_internal_group ON `test.my_table` GRANT TO ('group:<internal_group>@<domain>') FILTER USING (1=1);
As a result:
1. [email protected], which belongs to the group allAuthenticatedUsers, can only see one row, whilst
2. [email protected], which belongs to the group [email protected], can see all the rows.
In a nutshell, you can see that permissive row-level policies overrule the more restrictive ones (the same thing that happens with the IAM permission inheritance) and although not documented explicitly by Google, you may use groups such as allAuthenticatedUsers.
To wrap up, we have seen how to take full advantage of the row-level security feature in combination with your BI tool that is compatible with OAuth. This way, you have control of what, when and who can carry out actions. Also, you are doing all this with less maintenance. Try it out!