In the previous part, we thought about the main functionality of our bot and implemented three commands to see a list of feeds, add a new feed and remove an added feed.
In this part, we create another file that reads the feeds. We created bot.php to be called by the Telegram server, but the feed reader has to be called by a job scheduler (cron, for example) every five minutes (or at any interval you prefer). I use cron-job.org for that; it is free and easy to use.
Creating cron.php
Create a file named cron.php in the project directory, require autoload.php and our configuration files, and create an instance of TeleBot:
<?php
use TeleBot\TeleBot;
require_once __DIR__ . '/vendor/autoload.php';
$config = require_once __DIR__ . '/config/bot.php';
$feeds = json_decode(file_get_contents(__DIR__ . '/config/feeds.json'))->feeds;
$tg = new TeleBot($config['bot_token']);
Reading the feeds
Next, we add the following code:
$links = "<b>🆕 New posts:</b>\n\n";
$linkCount = 0;
foreach ($feeds as $feed) {
    $context = stream_context_create([
        "http" => [
            "method" => "GET",
            "header" => "User-Agent: FeedReaderBot",
        ],
    ]);
    $feedContent = new SimpleXMLElement(file_get_contents($feed->url, false, $context));
    foreach ($feedContent->channel->item as $item) {
        $links .= "<a href=\"{$item->link}\">▪️ {$item->title}</a>\n";
        $linkCount++;
    }
}
if ($linkCount === 0) {
    die();
}
$tg->sendMessage([
    'chat_id' => $config['owner_user_id'],
    'text' => $links,
    'parse_mode' => 'html',
    'disable_web_page_preview' => true,
]);
- Dev.to prevents us from loading the feeds, so we add a User-Agent header and we no longer look like a bot. 😎
- We parse the XML content using the SimpleXMLElement class; it ships with PHP, so we parse XML as easily as we parse JSON and do not need to install any packages.
- We create the list in a foreach loop.
- After the first loop, which iterates over all of the feed objects, we send a message using HTML format (because we use the <a> tag) and with link previews disabled.
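One caveat with the HTML format: a post title that contains characters such as & or < may produce markup that Telegram cannot parse. The parsing step can be tried in isolation with the minimal sketch below. It assumes a hypothetical helper named buildLinkLines (not part of the original cron.php) and a made-up inline feed, and it escapes the title and link with htmlspecialchars before building each line:

```php
<?php
// Hypothetical helper (not in the original project): turns the items of an
// RSS 2.0 document into HTML lines suitable for Telegram's HTML parse mode.
function buildLinkLines(string $rssXml): array
{
    $feedContent = new SimpleXMLElement($rssXml);
    $lines = [];
    foreach ($feedContent->channel->item as $item) {
        // SimpleXML returns element objects, so cast to string before escaping.
        $url = htmlspecialchars((string) $item->link, ENT_QUOTES);
        $title = htmlspecialchars((string) $item->title, ENT_QUOTES);
        $lines[] = "<a href=\"{$url}\">▪️ {$title}</a>";
    }
    return $lines;
}

// Sample feed invented for illustration only.
$sample = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item><title>Tips &amp; tricks for PHP</title><link>https://example.com/a</link></item>
    <item><title>Second post</title><link>https://example.com/b</link></item>
  </channel>
</rss>
XML;

echo implode("\n", buildLinkLines($sample)), "\n";
```

In cron.php the same two htmlspecialchars calls could wrap $item->link and $item->title inside the existing loop.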
Now, open the address of cron.php in your web browser; the bot should send you an elegant list of posts.
Oh, wait! If I refresh the page, it sends the same list! It is unacceptable, because we only want the new posts, not the latest posts!
Only new posts!
Yes, you are completely right! To avoid this problem, we must save the link of the latest post in our feed object:
{
"url": "feed-url-here",
"reader": "dev.to",
+ "last_item_url": "some-url-here"
}
Add the new field in bot.php:
$feeds[] = [
'url' => $url,
'reader' => 'dev.to',
'last_item_url' => '',
];
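Feeds that were saved to feeds.json before this change do not have the new field yet, and reading a missing property raises a warning in PHP 8. A minimal sketch of one way to backfill it after decoding the file, assuming a hypothetical helper named withLastItemUrl that is not part of the original project:

```php
<?php
// Hypothetical helper: makes sure every feed object decoded from feeds.json
// has a last_item_url property, defaulting to an empty string.
function withLastItemUrl(array $feeds): array
{
    foreach ($feeds as $feed) {
        if (!isset($feed->last_item_url)) {
            $feed->last_item_url = '';
        }
    }
    return $feeds;
}

// Example with an old-style entry that lacks the field (URL is made up).
$json = '{"feeds":[{"url":"https://example.com/feed","reader":"dev.to"}]}';
$feeds = withLastItemUrl(json_decode($json)->feeds);
echo json_encode(['feeds' => $feeds]), "\n";
```

An empty string never matches a real post link, so such a feed is treated as if nothing had been sent yet.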
We also need some changes in cron.php:
$links = "<b>🆕 New posts:</b>\n\n";
$linkCount = 0;
foreach ($feeds as $feed) {
    $context = stream_context_create([
        "http" => [
            "method" => "GET",
            "header" => "User-Agent: FeedReaderBot",
        ],
    ]);
    $feedContent = new SimpleXMLElement(file_get_contents($feed->url, false, $context));
    $latestPostLink = (string) $feedContent->channel->item[0]->link;
    // Nothing new in this feed: skip it and check the next one.
    if ($latestPostLink === $feed->last_item_url) {
        continue;
    }
    foreach ($feedContent->channel->item as $item) {
        // Stop at the last post that was already sent.
        if ((string) $item->link === $feed->last_item_url) {
            break;
        }
        $links .= "<a href=\"{$item->link}\">▪️ {$item->title}</a>\n";
        $linkCount++;
    }
    $feed->last_item_url = $latestPostLink;
}
if ($linkCount === 0) {
    die();
}
file_put_contents(__DIR__ . '/config/feeds.json', json_encode(['feeds' => $feeds]));
$tg->sendMessage([
    'chat_id' => $config['owner_user_id'],
    'text' => $links,
    'parse_mode' => 'html',
    'disable_web_page_preview' => true,
]);
If the latest post link is the same as the one already stored in feeds.json, we skip the current feed object and move on to the next. Otherwise, we add items to the end of the list until we reach the saved link. Finally, we assign $latestPostLink to the last_item_url property of the feed object and, outside the loop, save the $feeds array (which contains the updated objects) to feeds.json.
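The stop-at-the-saved-link rule can be checked on its own with plain strings. This is only a sketch: collectNewLinks is a hypothetical helper, and the links are made up and ordered newest first, as they are in the feed:

```php
<?php
// Hypothetical helper: returns the links that appear before the saved one.
// $itemLinks must be ordered newest first.
function collectNewLinks(array $itemLinks, string $lastItemUrl): array
{
    $new = [];
    foreach ($itemLinks as $link) {
        if ($link === $lastItemUrl) {
            break; // everything from here on was already sent
        }
        $new[] = $link;
    }
    return $new;
}

$itemLinks = ['https://example.com/c', 'https://example.com/b', 'https://example.com/a'];

var_dump(collectNewLinks($itemLinks, 'https://example.com/b')); // only the newest link
var_dump(collectNewLinks($itemLinks, ''));                      // all three links (new feed)
var_dump(collectNewLinks($itemLinks, 'https://example.com/c')); // empty array (nothing new)
```

Note that if the saved post disappears from the feed, the loop never meets it and every item counts as new.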
Add a cron-job
We do not want to enter the URL every five minutes, because if we had that much time, we would not need RSS/Atom/JSON feeds! So you should create a cron-job to send a request to our script every five minutes.
Hooray! Wait for the updates...
Thank you for following the tutorial and I hope you enjoyed it.
We used TeleBot to build this bot; you can support me by starring the repository:
https://github.com/muhammadmp97/TeleBot
And here is what we have done so far:
https://github.com/muhammadmp97/FeedReaderBot/tree/d238ac8429db6777b24a51f8618d17c2547a7254
Top comments (2)
Good article, I just have some minor notes that might come in handy when you want to scale up this app for production.
As I see, there are references to the SimpleXMLElement class along with the file_get_contents and file_put_contents functions... which is totally cool in any generic PHP application. But as you may know, those functions do some synchronous processing behind the scenes.
i would suggest using Guzzle async approach for the network calls.. (it might show some strange behaviors but it wont hold php-fpm)
asynchronous implementation of the file system can be a bit tricky, but since we have Fibers in new versions of PHP, thats not gonna be a problem. you can do stuff like:
and at the very end Cronjobs are what all php developers are used to, but nowadays we have different event loops in php..
keep up the good work.
Thank you for reading and providing these good points.
I might improve the whole project after finishing the tutorial and add some flexibility to support different types of feeds. We can have a feed reader for Twitter and other social networks as well! 😍