Thoughts and Links: March 2014

To Read

http://blog.urx.com/post/79290681131/differential-equations-in-data-science

Top 10 algorithms in Data Mining

http://www.cs.uvm.edu/~icdm/algorithms/CandidateList.shtml

Online Logistic Regression

An interesting idea:
http://jmlr.org/proceedings/papers/v23/mcmahan12/mcmahan12.pdf

For cases where I can fit the modelset into memory (spread across a cluster), I still think H2O is going to be the best approach.
OTOH, if I can run several simultaneous regressions in one -online- pass over the data…
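
To make the "several regressions in one pass" idea concrete, here is a rough sketch. It uses plain stochastic gradient descent on the log loss, which is an assumption on my part and not the method from the linked paper. The names `OnlineLogisticRegression` and `single_pass` are made up for illustration, and each record is assumed to carry one label per model.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


class OnlineLogisticRegression:
    """Hypothetical minimal model: one SGD step per observed example."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return sigmoid(z)

    def update(self, x, y):
        # Gradient of the log loss with respect to the logit is (p - y).
        error = self.predict_proba(x) - y
        self.bias -= self.learning_rate * error
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]


def single_pass(stream, models):
    """Read each record once and update every model with its own label.

    `stream` yields (x, labels), where labels[k] is the target for models[k].
    """
    for x, labels in stream:
        for model, y in zip(models, labels):
            model.update(x, y)


# Toy data: model 0 learns "x is positive", model 1 learns "x is negative".
xs = [(-1) ** i * (1 + i % 5) for i in range(200)]
stream = (([float(x)], (1 if x > 0 else 0, 1 if x < 0 else 0)) for x in xs)
models = [OnlineLogisticRegression(1), OnlineLogisticRegression(1)]
single_pass(stream, models)
```

The data is read exactly once, and memory use is one weight vector per model, which is the appeal when the data does not fit in memory.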

DL

http://research.microsoft.com/pubs/209355/NOW-Book-Revised-Feb2014-online.pdf

Kinda true...

http://slashdot.org/topic/bi/data-science-is-dead/

So go ahead: put "Data Scientist" on your resume. It may get you additional calls from recruiters, and maybe even a spiffy new job, where you'll be the King or Queen of a rotting whale-carcass of data. And when you talk to Master Data Management and Data Integration vendors about ways to, er, dispose of that corpse, you'll realize that the "Big Data" vendors have filled your executives' heads with sky-high expectations (and filled their inboxes with invoices worth significant amounts of money). Don't be the data scientist tasked with the crime-scene cleanup of most companies' "Big Data"—be the developer, programmer, or entrepreneur who can think, code, and create the future.

Std Error for a coefficient

In simple linear regression (one predictor), the standard error of the slope coefficient is a function of the residuals, the spread of the x values, and the sample size. The formula (see links below for an easier to read version) is:

s = sqrt[ (∑ (e_i)^2 / (n - 2)) / ∑ (x_i - xbar)^2 ]

where
x_i = the ith observation of x
e_i = the ith error (residual) = y_i - yhat_i
n = sample size
xbar = mean of the x values
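
A small sketch of the formula in code, assuming a simple one-predictor least-squares fit. The function name `slope_standard_error` is my own, not from either source.

```python
import math


def slope_standard_error(xs, ys):
    """Standard error of the slope in a one-predictor least-squares fit."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 points, since the formula divides by n - 2")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    # Sum of squared deviations of x from its mean.
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # e_i = y_i - yhat_i
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)
    return math.sqrt((sse / (n - 2)) / sxx)


se = slope_standard_error([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

For this toy data the residual sum of squares is 2.4 and the sum of squared x deviations is 10, so s = sqrt((2.4 / 3) / 10) = sqrt(0.08).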

Source:

http://en.wikipedia.org/wiki/Simple_line...

http://www.stat.yale.edu/Courses/1997-98…

Trading Knowledge

https://thinkery.me/nader/53172b621cb6025b0a0127c7

Accumulated knowledge from blogs, people, magazines, etc. over the last year.

Most importantly

Volume is the cause of price moves
Breakouts come with higher volume and need confirmation on the second day, with above-average volume afterwards
Plot volume with an overlaid MA20 (there are about 20 trading days per month)
Keep charts simple: price and volume are the most important
Concentrate on current investments, not past or future ones
Only look at the daily chart, not the 30-minute or 5-minute charts; those make you crazy
Breaking resistance needs a follow-through day, i.e. another day up
In a bull market a normal pullback from the highs is a 50% retracement
Always look at daily, weekly, and monthly charts to spot resistance / support points
Watch volume on the daily chart, where it is easier to compare with volume on other days
Read news and find trading ideas 6:00 - 7:30am (Europe), and in any case before the trading day starts
Always keep some cash for short-term opportunities
You don't need to trade every day!
If a company publishes earnings and the stock doesn't move much, it might be that most people already own the stock. It could go down.
Make sure you know who publishes the information and who's behind the news
Stay away from penny stocks
Use a stock scanner the day before trading to find potential candidates for trading
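
The volume and follow-through rules above can be sketched in code. This is my own reading of the notes, not a tested trading rule: "average volume" is taken as the mean of the preceding 20 days, a breakout is a close crossing above a resistance level I supply by hand, and the function names are made up.

```python
def average(values):
    return sum(values) / len(values)


def confirmed_breakouts(closes, volumes, resistance, window=20):
    """Indices of days that close above resistance on above-average volume
    and are followed by another up day, also on above-average volume."""
    hits = []
    for day in range(window, len(closes) - 1):
        # Average volume over the `window` days before each day being judged.
        avg_before_breakout = average(volumes[day - window:day])
        avg_before_follow = average(volumes[day + 1 - window:day + 1])
        broke_out = (
            closes[day - 1] <= resistance < closes[day]
            and volumes[day] > avg_before_breakout
        )
        followed_through = (
            closes[day + 1] > closes[day]
            and volumes[day + 1] > avg_before_follow
        )
        if broke_out and followed_through:
            hits.append(day)
    return hits


def retracement_level(low, high, fraction=0.5):
    """Price after giving back `fraction` of the move from low to high."""
    return high - fraction * (high - low)


closes = [10.0] * 20 + [11.0, 12.0]
volumes = [100] * 20 + [200, 150]
hits = confirmed_breakouts(closes, volumes, resistance=10.5)
```

With this toy series the breakout lands on day index 20 and is confirmed by the next day. A 50% retracement of a move from 100 to 120 comes out at 110.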

Key economic indicators

(always check for the general market direction)

Interest rate (price of money): low interest rates are good for the stock market
Inflation: high inflation is bad for the economy. Deflation is bad as well. Most sectors will do badly
GDP: steadily rising GDP is good for corporate earnings. Two consecutive quarters of falling GDP is called a recession
Unemployment rate
Consumer confidence (the lower consumer confidence is, the less spending happens)
US house prices
Most important indicator: ISM Index (a very good predictor of how the economy will be in 3 - 6 months): above 50 the economy is expanding, below 50 it is contracting
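
The two threshold rules in this list (ISM around 50, two falling GDP periods) are simple enough to write down. A tiny sketch with made-up function names, assuming GDP is given as a list of period levels, oldest first:

```python
def ism_signal(index):
    """Read an ISM index value against the 50 line."""
    if index > 50:
        return "expanding"
    if index < 50:
        return "contracting"
    return "unchanged"


def two_falling_periods(gdp_levels):
    """True if GDP fell in each of the last two periods."""
    if len(gdp_levels) < 3:
        return False
    return gdp_levels[-1] < gdp_levels[-2] < gdp_levels[-3]
```

For example, `ism_signal(52.3)` gives "expanding" and `two_falling_periods([100, 99, 98])` gives True.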

ML in 5 pictures

http://www.denizyuret.com/2014/02/machine-learning-in-5-pictures.html

Regression stats

http://facweb.cs.depaul.edu/sjost/csc423/documents/f-test-reg.htm

http://www.stat.yale.edu/Courses/1997-98/101/anovareg.htm#mult

Post dates

Thursday, March 20, 2014: To Read
Wednesday, March 19, 2014: Top 10 algorithms in Data Mining
Saturday, March 15, 2014: Online Logistic Regression
Friday, March 07, 2014: DL
Wednesday, March 05, 2014: Kinda true..., Std Error for a coefficient, Trading Knowledge
Monday, March 03, 2014: ML in 5 pictures
Sunday, March 02, 2014: Regression stats
