IMDB Movie Assignment: Reading and Checking Data

Written by akshita.goel on 10/29/2021 - 14:06

Subtask 1.1: Read the Movies Data

To read data in CSV or Excel format, you first need to import the pandas library. pandas is a Python module widely used in data science for data analysis.

The following code shows how to import pandas in a Jupyter Notebook:

import pandas as pd

Now that the pandas library is imported, we can read the data with the following code:

pd.read_csv(dataset location...)
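As a minimal sketch of this call, the snippet below reads a tiny in-memory CSV instead of the real dataset. The column names and values are invented for illustration; the variable name movies matches the one used later in this tutorial.

```python
import io

import pandas as pd

# Hypothetical stand-in for the movies CSV file (invented rows, not the real data).
csv_text = "title,year,rating\nMovie A,2010,7.5\nMovie B,2015,8.1\n"

# read_csv accepts a file path or a file-like object and returns a DataFrame.
movies = pd.read_csv(io.StringIO(csv_text))
print(movies)
```

With a real file, the io.StringIO object would simply be replaced by the location of the dataset.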

However, when we pass the location of our dataset in this way, executing the code raises an error while reading the data.

What is the reason for this error, and how can it be resolved?

Comparing the code with the syntax above, 'c:\Users\STU\Downloads/Movie+Assignment+Data.csv' is the location of the dataset that we are passing. The call follows the syntax exactly, yet we still get this error.

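REASON: In an ordinary Python string literal, the interpreter treats a backslash as the start of an escape sequence. In our path, the \U in c:\Users (positions 2-3 of the string) is read as the beginning of a \UXXXXXXXX Unicode escape, which must be followed by eight hexadecimal digits, so Python reports the error below. If you prefix the string with r, the location is treated as a raw string in which backslashes are ordinary characters, and the code executes successfully.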

(unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
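A minimal sketch of the raw-string fix, using the dataset path quoted above. The compile() call is only there to reproduce the error without stopping the example, and the os.path.exists guard is an assumption added so the snippet also runs on machines where the file is absent.

```python
import os

# Source text of the failing assignment; the outer r"..." keeps it intact.
bad_source = r"path = 'c:\Users\STU\Downloads/Movie+Assignment+Data.csv'"
try:
    compile(bad_source, "<example>", "exec")
    error_message = None
except SyntaxError as exc:
    error_message = exc.msg  # the 'unicodeescape' message

# With the r prefix, backslashes are kept as literal characters.
raw_path = r'c:\Users\STU\Downloads/Movie+Assignment+Data.csv'
print(error_message)
print(raw_path)

# Read the file only if it actually exists on this machine.
if os.path.exists(raw_path):
    import pandas as pd
    movies = pd.read_csv(raw_path)
```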

Alternatively, other ways of writing the path avoid the problem as well, such as doubling each backslash or using forward slashes throughout.
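As an illustration of alternative ways to write the path, the sketch below compares a raw string with doubled backslashes and with forward slashes. These particular forms are assumptions chosen for illustration, not necessarily the method the author had in mind.

```python
# The raw string and the doubled-backslash form produce the same string.
raw_path = r'c:\Users\STU\Downloads/Movie+Assignment+Data.csv'
escaped_path = 'c:\\Users\\STU\\Downloads/Movie+Assignment+Data.csv'

# Forward slashes contain no backslashes, so there is nothing to escape;
# Windows accepts this form as a path separator too.
forward_path = 'c:/Users/STU/Downloads/Movie+Assignment+Data.csv'

print(raw_path == escaped_path)
print(forward_path)
```

Any of these strings can be passed to pd.read_csv in place of the original literal.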

Subtask 1.2: Inspect the Dataframe

After reading the file with read_csv(), the next step is to check the data that we have.
Some basic operations we used to check the data are as follows:

1) To check the number of rows and columns in a dataset, we use the shape attribute. It describes the shape of the dataset in the form (rows, columns).

movies.shape

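A small sketch of shape on a made-up DataFrame; the columns and values are hypothetical and stand in for the real movies data.

```python
import pandas as pd

# Hypothetical stand-in for the movies DataFrame (invented values).
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "year": [2010, 2015, 2018],
    "rating": [7.5, 8.1, 6.9],
})

# shape is an attribute, so it is accessed without parentheses.
n_rows, n_cols = movies.shape
print(movies.shape)  # (3, 3)
```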

2) To find column-wise information about a dataset, we use the info() method. In pandas, info() reports the name of each column, the data type it contains, and its count of non-null values.

movies.info()
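A brief sketch of info() on a made-up DataFrame with one missing value, so the non-null count differs between columns. The data is hypothetical; the buf argument is used here only to capture the report as text.

```python
import io

import pandas as pd

# Hypothetical stand-in for the movies DataFrame; one rating is missing.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "year": [2010, 2015, 2018],
    "rating": [7.5, None, 6.9],
})

# info() prints its report and returns None; buf redirects the text.
buffer = io.StringIO()
movies.info(buf=buffer)
report = buffer.getvalue()
print(report)
```

In a notebook, calling movies.info() directly prints the same report. Note the parentheses: info is a method, unlike the shape attribute.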

3) The describe() function in pandas computes summary statistics for the numerical data in the dataset. Therefore, to get a summary of the numeric columns, we can use describe().

movies.describe()

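A short sketch of describe() on a made-up DataFrame (hypothetical columns and values). By default only the numeric columns appear in the summary, so the text column is left out.

```python
import pandas as pd

# Hypothetical stand-in for the movies DataFrame (invented values).
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "year": [2010, 2015, 2018],
    "rating": [7.5, 8.1, 6.9],
})

# Rows of the summary: count, mean, std, min, 25%, 50%, 75%, max.
summary = movies.describe()
print(summary)
```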

© Copyright By iVagus Services Pvt. Ltd. 2023. All Rights Reserved.
