A few weeks ago I was working on a new report for a customer's game. The platform that provides the campaign reports doesn't have a public API for generating them on request with any kind of developer key.
But it is possible to request such reports through their dashboard. I know it's a bit odd to rely on the UI for downloading reports, but it was the only way to get access to the customer's valuable data.
Let's define the requirements for this idea:
- it should be a standalone Python script for easy execution and integration with existing ETL libraries
- it should not require extra software on the server except the docker package (which is pretty flexible)
Now we are ready to give it a try and build something runnable. In this post I am going to pin specific library versions for talking to the docker service, because of the docker package version installed on CentOS (in my example).
My requirements.txt:
docker==2.1.0
splinter==0.7.7
timeout-decorator==0.3.3
splinter is a nice library that wraps browser drivers for automating anything on web pages.
Let's define a class for running the Google Chrome container; later we will use it to get access to the page via the splinter library.
import time

import docker
import timeout_decorator


class _ChromeContainer:
    '''
    Runs the Chrome docker container in the background.
    Requires the docker service on the machine to pull
    and run images.
    '''

    def __init__(self):
        self.__image_name = "selenium/standalone-chrome:3.10.0"
        self.__client = docker.from_env()

    def run(self):
        '''
        Starts the docker container with chromedriver and waits
        until it reaches the running state
        '''
        # Map container port 4444 to a random free port on the host
        self.container = self.__client.containers.run(self.__image_name,
                                                      detach=True,
                                                      ports={'4444/tcp': None})

        @timeout_decorator.timeout(120)
        def waiting_up(container):
            while True:
                # Refresh the cached state from the docker service
                container.reload()
                if container.status == "running":
                    break
                time.sleep(1)

        waiting_up(self.container)

    def quit(self):
        '''
        Kills and deletes the container
        '''
        # force=True kills the container if it is still running, then removes it
        self.container.remove(force=True)

    @property
    def public_port(self):
        # Host port that docker assigned to the Selenium port 4444
        ports = self.container.attrs["NetworkSettings"]["Ports"]
        return ports["4444/tcp"][0]["HostPort"]
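One thing to keep in mind: the loop in run only waits until docker reports the container as running. The Selenium server inside the container may need a few more seconds before it accepts connections. Here is a minimal sketch of an extra readiness check, using only the standard library. The helper name wait_for_http is my own, and treating any HTTP 200 answer from the /wd/hub/status endpoint as "ready" is an assumption, so adjust it to what your Selenium image actually returns.

```python
import http.client
import time
import urllib.request


def wait_for_http(url, timeout=60.0, interval=0.5):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass.

    Hypothetical helper for this post, not part of docker or splinter.
    Returns True when the endpoint answered, False when the deadline expired.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval + 1) as response:
                if response.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Nothing is listening yet, or the server is still starting up.
            pass
        time.sleep(interval)
    return False
```

It could be called at the end of run, for example with "http://127.0.0.1:{}/wd/hub/status".format(self.public_port), before the browser session is opened.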
Now we are ready to use splinter and _ChromeContainer to automate the task.
import timeout_decorator
import docker
from splinter import Browser
class Worker:
    def __init__(self):
        self.__chrome_container = _ChromeContainer()

    def process(self):
        self.__chrome_container.run()
        try:
            url = "http://127.0.0.1:{}/wd/hub".format(
                self.__chrome_container.public_port)
            self.__web_client = Browser('remote', url=url, browser='chrome')
            try:
                # Example for login request:
                self.__login()
            finally:
                self.__web_client.quit()
        finally:
            # Stop the container even if the browser could not be started
            self.__chrome_container.quit()

    def __login(self):
        self.__web_client.visit("http://www.example.com/login")
        self.__web_client.fill('developer_session[email]', 'EXAMPLE_USERNAME')
        self.__web_client.fill('developer_session[password]', 'EXAMPLE_PASSWORD')
        button = self.__web_client.find_by_id('developer_session_submit')
        button.click()
This is just an example; you can extend it by adding more steps similar to __login to your Worker class.
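Since the script is meant to run standalone next to existing ETL jobs, the placeholder credentials in __login have to come from somewhere at run time. One possible approach, sketched below, is to read them from environment variables. The variable names REPORT_USERNAME and REPORT_PASSWORD and the helper load_credentials are hypothetical, picked only for this example.

```python
import os


def load_credentials(environ=os.environ):
    """Return (username, password) read from environment variables.

    REPORT_USERNAME and REPORT_PASSWORD are hypothetical names used for
    this sketch; replace them with whatever your deployment defines.
    """
    names = ("REPORT_USERNAME", "REPORT_PASSWORD")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        # Fail early with a clear message instead of submitting an empty form.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return environ["REPORT_USERNAME"], environ["REPORT_PASSWORD"]
```

The returned pair could then be passed to the two fill calls in __login instead of the hardcoded placeholders.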
Thank you for reading! :)