!pip install isbnlib
!pip install newspaper3k
!pip install goodreads_api_client
Requirement already satisfied: isbnlib in /opt/conda/lib/python3.6/site-packages (3.10.3)
Requirement already satisfied: newspaper3k in /opt/conda/lib/python3.6/site-packages (0.2.8)
Requirement already satisfied: goodreads_api_client in /opt/conda/lib/python3.6/site-packages (0.1.0.dev4)
import torch
import numpy as np 
import pandas as pd
import os
import seaborn as sns
import isbnlib
from newspaper import Article
import matplotlib.pyplot as plt
plt.style.use('ggplot')
from tqdm import tqdm
from progressbar import ProgressBar
import re
from scipy.cluster.vq import kmeans, vq
from pylab import plot, show
from matplotlib.lines import Line2D
import matplotlib.colors as mcolors
import goodreads_api_client as gr
from sklearn.cluster import KMeans
from sklearn import neighbors
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import warnings
warnings.filterwarnings("ignore")
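# Skip malformed rows instead of raising an error. error_bad_lines is deprecated
# since pandas 1.3; on newer versions use on_bad_lines='skip' (or 'warn').
df = pd.read_csv('../input/books.csv', error_bad_lines=False)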

b'Skipping line 3350: expected 12 fields, saw 13\nSkipping line 4704: expected 12 fields, saw 13\nSkipping line 5879: expected 12 fields, saw 13\nSkipping line 8981: expected 12 fields, saw 13\n'
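The messages above report rows that parsed into 13 fields where the header defines 12. One common cause (an assumption here, not verified against books.csv) is an unquoted comma inside a field such as an author list. The sketch below reproduces the effect with the standard-library `csv` module on a small made-up sample; `SAMPLE` and `split_good_and_bad` are hypothetical names used only for illustration.

```python
import csv
import io

# Hypothetical sample data, NOT taken from books.csv: 3 columns instead of 12.
SAMPLE = (
    "bookID,title,authors\n"
    "1,Dune,Frank Herbert\n"
    "2,Good Omens,Terry Pratchett, Neil Gaiman\n"  # unquoted comma adds a 4th field
    "3,Emma,Jane Austen\n"
)


def split_good_and_bad(text):
    """Return (header, good_rows, bad_rows); bad_rows holds (line_number, field_count)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    good, bad = [], []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if len(row) == len(header):
            good.append(row)
        else:
            bad.append((line_no, len(row)))
    return header, good, bad


header, good, bad = split_good_and_bad(SAMPLE)
for line_no, n_fields in bad:
    print(f"Skipping line {line_no}: expected {len(header)} fields, saw {n_fields}")
```

Inspecting the reported line numbers in the raw file this way can help decide whether the skipped rows are worth repairing (for example by quoting the offending field) rather than dropping.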
# Index rows by bookID; the bookID column itself is kept as a regular column.
df.index = df['bookID']