Introduction

Managing huge amounts of structured and unstructured data is crucial to the success of every company. It takes systematic organization and governance to ensure that data is of high quality and suitable for analytics and business intelligence applications. Although the key aspects of big data are often summarized as the popular 3 Vs of Volume, Velocity, and Variety, there are other key questions that every company needs to ask when choosing the right process to store and transform its data.

Big Data Aspects

Volume: How big is the incoming data stream and how much storage is needed?

Velocity: Refers to…


Designer Genomes

Retrieved from genome.gov

Over the last few decades, thanks to the contributions of the scientific community, our understanding of what life means has changed dramatically. Synthetic genomics is an emerging field in which scientists use genetic modification or artificial gene synthesis to create new DNA or entire lifeforms. In 2010, they created the first synthetic living cell, and just a few months ago, in March 2017, they designed six chromosomes of a far more complex organism: Saccharomyces cerevisiae, or baker’s yeast.

These Sc2.0 chromosomes were substantially different…


A beginner’s journey to data science, from zero to hero!

ARTWORK: TAMAR COHEN, ANDREW J BUBOLTZ, 2011, SILK SCREEN ON A PAGE FROM A HIGH SCHOOL YEARBOOK

There are a lot of data science materials to read and study, and keeping track of them is getting harder by the day. In this vast ocean of books, videos, and blogs (or, to sum it up, data), finding what you need to read can be a cumbersome task!

This post assumes you already have beginner-level skills in Python. Although there’s no strict order for reading these materials, I’d still recommend going through them in the order I have outlined.


Are you stuck writing your math or chemical equations? Are you tired of checking the markdown every time? Stop torturing the Shift/Ctrl+Enter keys right now!

Whenever I try to write an equation in markdown, whether in Jupyter or RStudio, I get confused. What are all these symbols?! How am I supposed to memorize them? There should be an easier way to put simple (or complex) formulas in markdown, right?

I present to you the red / blue pill theory:

To pill or not to pill!

If you want to take a life of harsh knowledge, desperate freedom, and…


Scientists have eliminated HIV in mice using a genome editing method called CRISPR.

CRISPR (clustered regularly interspaced short palindromic repeats) is a mechanism bacteria use to fight invasion by foreign DNA molecules. Scientists have developed a way to use this system to perform gene editing with scissor-like precision, and since the human immunodeficiency virus (HIV) resides in the host’s DNA, CRISPR can be a great method to fight HIV. …


Data Science (DS) can be summed up as the combination of statistical analysis and programming skills used to analyze high-volume data sets and provide meaningful predictions and results. This requires many skills, such as statistics, data mining, regression, classification, predictive modeling, and data visualization.

Gathering the data is only the beginning of this practice, since most raw data is useless without proper filtering, sorting, and cleaning. Many data sets require the data scientist’s input to merge, remove, join, and cut out specific parts to prepare them for the specific analysis or modeling.
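To make the preparation steps concrete, here is a minimal sketch in plain Python. The records, field names, and the clean_orders helper are all invented for illustration and are not taken from the original post; a real project would more likely use a library such as pandas.

```python
# Hypothetical raw records; the field names and values are invented for illustration.
raw_orders = [
    {"order_id": 1, "customer_id": "c1", "amount": "19.99"},
    {"order_id": 2, "customer_id": "c2", "amount": ""},      # missing amount
    {"order_id": 3, "customer_id": "c1", "amount": "5.00"},
    {"order_id": 3, "customer_id": "c1", "amount": "5.00"},  # duplicate row
]
# Hypothetical second data set to join against.
customers = {"c1": "Ada", "c2": "Grace"}


def clean_orders(orders, customer_names):
    """Filter, de-duplicate, join, and sort a list of order records."""
    seen_ids = set()
    cleaned = []
    for row in orders:
        # Filter: drop rows with a missing amount.
        if not row["amount"]:
            continue
        # Remove: skip order ids that have already been kept.
        if row["order_id"] in seen_ids:
            continue
        seen_ids.add(row["order_id"])
        cleaned.append({
            "order_id": row["order_id"],
            # Join: look up the customer name in the second data set.
            "customer": customer_names[row["customer_id"]],
            # Clean: convert the text amount to a number.
            "amount": float(row["amount"]),
        })
    # Sort: largest amounts first.
    return sorted(cleaned, key=lambda r: r["amount"], reverse=True)


prepared = clean_orders(raw_orders, customers)
```

Only after a pass like this is the data set in a shape that an analysis or model can reasonably consume.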

DS became…

Yashar Mansouri

✔ Data Scientist / Engineer. Coffee➡Code➡Data➡ML➡Life
