I really enjoy regression. I'd say regression was maybe one of the
first concepts that really helped me understand data, so I enjoy
regression.

I really like data visualization. I think it's a key element for people
to get their message across to people who don't understand that well what
data science is.

Artificial neural networks. I'm really passionate about neural networks
because we have a lot to learn from nature. When we try to mimic our brain,
I think we can build applications that use this biological behavior in
algorithms.

Data visualization with R. I love to do this.

Nearest neighbor. It's the simplest, but it just gets the best results so
many more times than some overblown, overworked algorithm that's just as
likely to overfit as it is to make a good fit.

Structured data is more like tabular data, things that you're familiar
with from Microsoft Excel: you've got rows and columns, and that's called
structured data. Unstructured data is basically data that comes mostly from
the web, where it's not tabular. It's not in rows and columns; it's text, and
sometimes it's video and audio, so you would have to deploy more sophisticated
algorithms to extract data. In fact, a lot of times we take unstructured data
and spend a great deal of time and effort to get some structure out of it, and
then analyze it. So if you have something that fits nicely into tables, columns,
and rows, go ahead, that's your structured data. But if it's a weblog, or if
you're trying to get information out of web pages and you've got a gazillion
web pages, that's unstructured data, and it would require a little more effort
to get information out of it.

Let me explain regression in the simplest possible
terms. If you have ever taken a cab ride, a taxi ride, you understand regression.
Here's how it works. The moment you sit in a cab, you see that there's a fixed
amount there; it says $2.50. Whether the cab moves or you get off, this is what
you owe the driver the moment you step into a cab. That's a constant: you have
to pay that amount if you have stepped into a cab.
Then, as it starts moving, for every meter or hundred meters the fare increases
by a certain amount. So there's a fraction, a relationship between distance and
the amount you would pay above and beyond that constant. And if you're not
moving and you're stuck in traffic, then for every additional minute you have
to pay more. So as the minutes increase, your fare increases; as the distance
increases, your fare increases; and while all this is happening, you've already
paid a base fare, which is the constant. This is what regression is. Regression
tells you what the base fare is, what the relationship is between time and the
fare you have paid, and what the relationship is between the distance you have
traveled and the fare you've paid. In the absence of knowing those
relationships, and just knowing how much people traveled and how much they
paid, regression allows you to compute that constant that you didn't know,
that it was $2.50, and it would compute the relationship between the fare and
the distance, and between the fare and the time.
That is regression.
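
The taxi example can be sketched in Python. This is a minimal illustration, not something shown in the talk. It assumes NumPy is available. Only the $2.50 base fare comes from the explanation above; the per-kilometer rate, the per-minute rate, the noise level, and all variable names are made up for the example. The sketch generates synthetic rides from that hypothetical fare schedule, then uses ordinary least squares (`numpy.linalg.lstsq`) to recover the constant and the two relationships from nothing but distances, waiting times, and fares paid.

```python
import numpy as np

# Hypothetical fare schedule, used only to generate example rides.
# The 2.50 base fare is from the taxi explanation; both rates are assumptions.
BASE_FARE = 2.50
RATE_PER_KM = 1.75    # assumed charge per kilometer traveled
RATE_PER_MIN = 0.40   # assumed charge per minute stuck in traffic

rng = np.random.default_rng(seed=0)
n_rides = 200

# What we would observe for each ride: distance, waiting time, fare paid.
distance_km = rng.uniform(0.5, 20.0, size=n_rides)
waiting_min = rng.uniform(0.0, 15.0, size=n_rides)
noise = rng.normal(0.0, 0.10, size=n_rides)  # small assumed measurement noise
fare = BASE_FARE + RATE_PER_KM * distance_km + RATE_PER_MIN * waiting_min + noise

# Design matrix: a column of ones lets the fit estimate the constant (base fare).
X = np.column_stack([np.ones(n_rides), distance_km, waiting_min])

# Ordinary least squares: find coefficients minimizing the squared fare error.
coef, *_ = np.linalg.lstsq(X, fare, rcond=None)
est_base, est_per_km, est_per_min = coef

print(f"estimated base fare:       {est_base:.2f}")
print(f"estimated rate per km:     {est_per_km:.2f}")
print(f"estimated rate per minute: {est_per_min:.2f}")
```

With enough rides, the estimated constant should land close to the base fare and the two estimated rates close to the assumed per-kilometer and per-minute charges, even though the fit never sees the fare schedule itself.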