I've been building Telegram bots for some time now, and with GDPR right around the corner I've had to turn a few off while I assess its impact, since I don't feel I have enough information to decide whether they are safe to keep running. In this post I'm looking for feedback on approaches, and for any additional information that can help me work out what's compliant and what needs to change.
A set of bots I'm particularly concerned with are the ones I've built to manage large Telegram supergroups. For anyone familiar with Group Butler, they're very similar, with features such as:
- Removing bad links
- Removing profanity
- Muting or banning users
- Custom commands to display group information
Most recently I've been asked by a client if I could incorporate analytics into the bot so they can measure things such as:
- Messages per day
- Number of banned users, etc.
These are the typical things I would expect a company to want to learn about its community. It's probably worth mentioning that this particular group has over 12,000 users.
While I was building out the analytics module, it struck me that this is not outside the scope of GDPR just because it's a bot sitting in a group acting on user interactions.
I decided to take a look into the rules and realised that it's not just the analytics that could pose a problem here. When an admin bans a user, they set a time period for the ban to expire, and the bot invites the user back to the chat afterwards if they wish. This requires storing a user's Telegram ID. For anyone unfamiliar, this is the data a bot can generally see when interacting with a user on Telegram:
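For reference, a `User` object from the Telegram Bot API contains roughly the following fields (the values here are made up for illustration):

```python
# Illustrative Telegram Bot API User object (values are invented)
user = {
    "id": 123456789,          # numeric ID, unique and permanent per user
    "is_bot": False,
    "first_name": "Jane",     # can be changed by the user at any time
    "username": "jane_doe",   # unique, but also changeable at will
    "language_code": "en",
}
```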
The id above is unique to every user. The username is also unique but chosen by the user, and it can be changed at any time, just like the first name.
I currently store the ID along with the time frame for the ban and the reason for it. I process this later to invite the users back.
Reading up on the GDPR rules, the Telegram ID is a piece of information that is unique to a person and, combined with other information, can be used to discover their identity. Strike 1?
If I then delve into the requirements for analytics, it becomes a complete minefield. While I could record an event, it appears that sending any unique user ID along with it can cause a lot of GDPR headaches.
While I could go ahead and simply use timestamps, when it comes to attributing commands used, message counts, etc., without tying these to some identifier for a particular user there isn't much hope of understanding user activity (even if I'm not bothered about knowing exactly who a given user is).
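For what it's worth, the purely timestamp-based version is trivial. A minimal sketch (all names are mine) that yields messages per day without storing any identifier at all:

```python
from collections import Counter
from datetime import datetime, timezone

# Per-day message counts: timestamps only, no user identifier anywhere.
messages_per_day: Counter[str] = Counter()

def record_message(ts: float) -> None:
    """Bucket a message event by its UTC calendar day."""
    day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
    messages_per_day[day] += 1
```

This covers the "messages per day" metric, but anything per-user (top commenters, command usage by user) still needs some identifier, which is exactly where the minefield starts.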
I had planned on using something like amplitude.com for analytics processing, and while they are "GDPR ready", I'm fairly confident that only protects their side of the offering and does little to help mine.
If I assume, based on the above, that GDPR most definitely applies here, then how do I go about the following scenarios?
Informing users about the usage of their data. This is a chat app, and unless a user interacts with the bot first, it cannot send them a direct message to inform them.
Handling requests to be forgotten. Right now I could build out a command to do this and process the request quite easily, but once again, is that even sufficient?
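A minimal sketch of what such a command handler might do, with the command name and the storage shape entirely hypothetical:

```python
def handle_forget_me(user_id: int, storage: dict) -> str:
    """Erase everything keyed on this user's Telegram ID.
    `storage` stands in for whatever the bot persists (bans, stats, ...)."""
    removed = storage.pop(user_id, None) is not None
    return ("Your data has been deleted." if removed
            else "We hold no data linked to your account.")
```

The mechanics are easy; whether a chat command counts as an adequate erasure channel is the open question.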
Handling consent. This is one of the hardest parts. I've seen the blanket emails going around ("We've updated our policy... ya cool? 😥"). Translating this to a bot in a messaging app is probably going to be a lot of work, but right now it feels like that's exactly what I'm going to have to do. Once again, though, how the user is told about this when the bot cannot initiate a private conversation first is going to be a problem.
While the above scenarios could be handled by spamming messages into the chat every so often informing users of their rights and options, that's precisely what some clients don't want. I'm also thinking a do-not-track option is easy enough to build, but does it cover enough?
I've spent a good few hours now looking through Google and the best-practice documents out there, and I'm quite amazed that this hasn't been discussed before. In fact, right now I can list a number of bot accounts and services on Telegram that could be in for a lot more issues than I feel I'm up against.
Any thoughts on this one?
Top comments (3)
I have a lot of thoughts on this one... I am a lawyer, but I still haven't worked out how to advise my clients. Your post was very enlightening for me, a Luddite... who is trying to work out how to apply the GDPR and other laws to Telegram bots. Would be happy to have a Skype call to brainstorm, if you like.
On the analytics side of the issue, I believe any bot should be safe as long as it produces aggregated data readable by humans. Internally, I'd hash the user IDs, which are probably the only possible way of linking a particular user to identifying information. Here, hashed user IDs would stay hashed if they need to hit permanent storage at some point.
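One caveat worth adding to the hashing idea: Telegram IDs are small integers, so a plain unsalted hash can be reversed by brute force. A keyed hash (HMAC) with a secret kept out of the analytics store is stronger, though under GDPR this still counts as pseudonymisation rather than anonymisation. A sketch (the secret and function name are placeholders):

```python
import hashlib
import hmac

# Assumption: this key lives outside the analytics store and can be rotated.
SECRET = b"rotate-me-and-keep-me-out-of-the-analytics-store"

def pseudonymize(user_id: int) -> str:
    """Keyed hash of a Telegram ID. A bare sha256(id) is brute-forceable
    because the ID space is small; HMAC with a separate secret is not."""
    return hmac.new(SECRET, str(user_id).encode(), hashlib.sha256).hexdigest()
```

The same input always maps to the same token, so per-user counts still work, while the analytics records never contain the raw ID.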