AI/machine learning technology is advancing rapidly. There is a great deal of active research, and big tech is leading the way. Fortunately, there are also plenty of resources available to technologists. There are so many, in fact, that we had to cherry-pick the tools & datasets that look the most legitimate & useful.
- Accord Framework http://accord-framework.net
- Aligned Face Dataset from Pinterest (CC0) https://www.kaggle.com/frules11/pins-face-recognition
- Amazon Reviews Dataset https://snap.stanford.edu/data/web-Amazon.html
- Apache SystemML https://systemml.apache.org
- AWS Open Data https://registry.opendata.aws
- Baidu Apolloscapes http://apolloscape.auto
- Beijing Laboratory of Intelligent Information Technology Vehicle Dataset http://iitlab.bit.edu.cn/mcislab/vehicledb
- Berkeley Caffe http://caffe.berkeleyvision.org
- Berkeley DeepDrive https://bdd-data.berkeley.edu
- Caltech Dataset http://www.vision.caltech.edu/html-files/archive.html
- Cats in Movies Dataset https://public.opendatasoft.com/explore/dataset/cats-in-movies/information
- Chinese Character Dataset http://www.iapr-tc11.org/mediawiki/index.php?title=Harbin_Institute_of_Technology_Opening_Recognition_Corpus_for_Chinese_Characters_(HIT-OR3C)
- Chinese Text in the Wild Dataset (CC4.0) https://ctwdataset.github.io
- CelebA Dataset (research only) http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
- Cityscapes Dataset https://www.cityscapes-dataset.com
- Clash of Clans User Comments Dataset (GPL 2) https://www.kaggle.com/moradnejad/clash-of-clans-50000-user-comments
- Core ML https://developer.apple.com/machine-learning
- Cornell Movie Dialogs Corpus http://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html
- Deep Learning for Java https://deeplearning4j.org
- Enron Email Dataset https://www.cs.cmu.edu/~./enron
- Facebook AI Tools https://ai.facebook.com/tools
- GitHub Deep Learning https://github.com/topics/deep-learning
- GitHub Machine Learning https://github.com/topics/machine-learning
- GitHub Natural Language Processing https://github.com/topics/nlp
- GitHub TensorFlow https://github.com/topics/tensorflow
- Google Dataset Search https://toolbox.google.com/datasetsearch
- Google Facial Expression Comparison Dataset (CC0 1.0) https://ai.google/tools/datasets/google-facial-expression
- Google Landmarks Dataset https://www.kaggle.com/google/google-landmarks-dataset
- Google ML Kit https://developers.google.com/ml-kit
- Google Open Images Dataset https://ai.googleblog.com/2016/09/introducing-open-images-dataset.html
- Google Teachable Machine https://teachablemachine.withgoogle.com
- H2O AI https://www.h2o.ai
- IBM Watson Starter Kits https://cloud.ibm.com/developer/watson/starter-kits
- IMDB Movie Review Dataset http://ai.stanford.edu/~amaas/data/sentiment
- ImageNet Image Database http://image-net.org
- JVC Video Game Reviews Dataset https://www.kaggle.com/floval/jvc-game-reviews
- Kaggle Datasets https://www.kaggle.com
- Labeled Faces in the Wild http://vis-www.cs.umass.edu/lfw
- LabelMe Dataset http://labelme.csail.mit.edu/Release3.0/browserTools/php/dataset.php
- LISA Traffic Light Dataset (CC BY-NC-SA 4.0) https://www.kaggle.com/mbornoe/lisa-traffic-light-dataset
- Machine Learning Playground http://ml-playground.com
- Machine Learning Showcase https://ml-showcase.com
- Mahout https://mahout.apache.org
- Microsoft Cognitive Toolkit https://docs.microsoft.com/en-us/cognitive-toolkit
- Microsoft Distributed Machine Learning Toolkit http://www.dmtk.io
- Million Song Dataset http://millionsongdataset.com
- MLlib https://spark.apache.org/mllib
- Movie Review Datasets http://www.cs.cornell.edu/people/pabo/movie-review-data
- MovieLens Datasets https://grouplens.org/datasets/movielens
- Mushroom Dataset https://archive.ics.uci.edu/ml/datasets/mushroom
- MXNet https://mxnet.apache.org
- Mycroft https://mycroft.ai
- Natural Earth Data http://www.naturalearthdata.com/downloads
- Numenta https://numenta.com
- ONNX https://onnx.ai
- Open ML Datasets https://www.openml.org/search?type=data
- OpenCyc https://www.cyc.com/opencyc
- OpenNN http://www.opennn.net
- Oryx 2 http://oryx.io
- Oxford Robotcar Dataset (CC4.0) https://robotcar-dataset.robots.ox.ac.uk
- PredictionIO http://predictionio.apache.org
- Price of Weed Dataset https://github.com/frankbi/price-of-weed
- PyTorch https://pytorch.org
- Real & Fake Face Detection https://www.kaggle.com/ciplab/real-and-fake-face-detection
- Scikit-learn https://scikit-learn.org
- Shogun https://www.shogun-toolbox.org
- Stanford Cars Dataset http://ai.stanford.edu/~jkrause/cars/car_dataset.html
- Stanford Dogs Dataset http://vision.stanford.edu/aditya86/ImageNetDogs
- Stanford Large Network Dataset Collection https://snap.stanford.edu/data
- Stanford Sentiment Treebank https://nlp.stanford.edu/sentiment/code.html
- The Blog Authorship Corpus (research only) http://u.cs.biu.ac.il/~koppel/BlogCorpus.htm
- The French Lexicon Project https://sites.google.com/site/frenchlexicon/results
- Theano http://www.deeplearning.net/software/theano
- TensorFlow https://www.tensorflow.org
- TME Motorway Dataset (research only) http://cmp.felk.cvut.cz/data/motorway
- Torch http://torch.ch
- Tufts Face Database (research only) http://tdface.ece.tufts.edu
- UCI Machine Learning Repository http://archive.ics.uci.edu/ml/index.php
- UFO Reports Dataset https://github.com/planetsig/ufo-reports
- Vandal Video Game Reviews Dataset https://www.kaggle.com/floval/12-000-video-game-reviews-from-vandal
- Visual Genome http://visualgenome.org
- Wacky Corpus (CC BY-NC-SA 4.0) https://wacky.sslmit.unibo.it/doku.php?id=corpora
- Wine Quality Dataset https://archive.ics.uci.edu/ml/datasets/wine+quality
- World Bank Open Data https://data.worldbank.org
- Yale Face Database (research only) http://cvc.cs.yale.edu/cvc/projects/yalefaces/yalefaces.html
- Yelp Open Dataset (research only) https://www.yelp.com/dataset
- YouTube-8M Segments Dataset https://research.google.com/youtube8m
Big Tech R&D
- AI2 https://allenai.org
- AWS Machine Learning https://aws.amazon.com/machine-learning
- Baidu Research http://research.baidu.com/Blog
- Berkeley Artificial Intelligence Research (BAIR) https://bair.berkeley.edu
- DeepMind https://deepmind.com
- Duolingo AI https://ai.duolingo.com
- Energy.gov https://www.energy.gov/artificial-intelligence-and-machine-learning
- Facebook AI https://ai.facebook.com
- Facebook AI Research https://research.fb.com/category/facebook-ai-research
- GE Artificial Intelligence https://www.ge.com/research/technology-domains/artificial-intelligence
- Google AI https://ai.google
- Google AI & Machine Learning Products https://cloud.google.com/products/ai
- IBM Research AI https://www.research.ibm.com/artificial-intelligence
- Intel AI https://software.intel.com/en-us/ai
- Journal of Artificial Intelligence Research (JAIR) https://www.jair.org
- Microsoft Artificial Intelligence https://www.microsoft.com/en-us/research/research-area/artificial-intelligence
- OpenAI https://openai.com
- Partnership on AI https://www.partnershiponai.org
- TayTweets https://twitter.com/tayandyou
Let us know if we missed your favorite AI/machine learning tool or dataset. Also be sure to check out places to educate yourself about AI/machine learning.
This data is from Vuild’s list of AI/machine learning tools & datasets. Please visit vuild.com for more.