Thursday, March 20, 2014
Wednesday, March 19, 2014
Saturday, March 15, 2014
Online Logistic Regression
An interesting idea..
http://jmlr.org/proceedings/papers/v23/mcmahan12/mcmahan12.pdf
For cases where I can fit the model set into memory (spread across a cluster), I still think H2O is going to be the best approach.
OTOH, if I can run several simultaneous regressions in one online pass over the data…
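A rough sketch of what "several simultaneous regressions in one online pass" could look like. This uses plain stochastic gradient descent on the log loss, which is my assumption and not the method from the linked paper. The names `OnlineLogit`, `single_pass`, and `make_stream` are made up for illustration, and the data is synthetic.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


class OnlineLogit:
    """One logistic regression model updated by plain SGD (an assumption)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = learning_rate

    def predict(self, x):
        return sigmoid(self.b + sum(wi * xi for wi, xi in zip(self.w, x)))

    def update(self, x, y):
        # Gradient of the log loss with respect to the linear score is (p - y).
        g = self.predict(x) - y
        self.w = [wi - self.lr * g * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * g


def single_pass(stream, models):
    """Feed each (x, labels) record to every model exactly once.

    labels[k] is the 0/1 target for models[k].
    """
    for x, labels in stream:
        for model, y in zip(models, labels):
            model.update(x, y)
    return models


def make_stream(n_records, seed=0):
    """Hypothetical synthetic data: target k is 1 when feature k is positive."""
    rng = random.Random(seed)
    for _ in range(n_records):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, [1 if x[0] > 0 else 0, 1 if x[1] > 0 else 0]
```

Usage would be something like `single_pass(make_stream(2000), [OnlineLogit(2), OnlineLogit(2)])`: each record is read once and every model gets its own update, so nothing beyond the weights has to stay in memory.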
Friday, March 07, 2014
Wednesday, March 05, 2014
Kinda true...
So go ahead: put "Data Scientist" on your resume. It may get you additional calls from recruiters, and maybe even a spiffy new job, where you'll be the King or Queen of a rotting whale-carcass of data. And when you talk to Master Data Management and Data Integration vendors about ways to, er, dispose of that corpse, you'll realize that the "Big Data" vendors have filled your executives' heads with sky-high expectations (and filled their inboxes with invoices worth significant amounts of money). Don't be the data scientist tasked with the crime-scene cleanup of most companies' "Big Data"—be the developer, programmer, or entrepreneur who can think, code, and create the future.
Std Error for a coefficient
In simple linear regression (one predictor), the standard error of the slope coefficient is a function of the residuals, the spread of the x values, and the sample size. The formula (see links below for an easier to read version) is:
s = sqrt[ ∑ ((e_i)^2/(n-2)) / (∑ (x_i - xbar)^2) ]
where
x_i = the ith observation
e_i = the ith error = y_i - yhat_i
n = sample size
xbar = mean of the x values
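The formula above can be checked with a few lines of Python. This is a minimal sketch that assumes a single predictor and an ordinary least squares fit with an intercept. The function name `slope_standard_error` is my own.

```python
import math


def slope_standard_error(xs, ys):
    """Standard error of the slope in a one-predictor least squares fit."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")

    xbar = sum(xs) / n
    ybar = sum(ys) / n

    # Sum of squared deviations of x: the denominator of the formula.
    sxx = sum((x - xbar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")

    # Least squares fit, needed to get yhat_i and the errors e_i.
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    intercept = ybar - slope * xbar

    # Sum of squared errors e_i = y_i - yhat_i.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

    return math.sqrt((sse / (n - 2)) / sxx)
```

For example, with xs = [1, 2, 3, 4, 5] and ys = [2, 4, 5, 4, 5] the squared errors sum to 2.4 and the x deviations to 10, so s = sqrt((2.4 / 3) / 10) = sqrt(0.08).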
Source:
Trading Knowledge
Accumulated knowledge from blogs, people, magazines, etc. over the last year.
Most importantly
- Volume drives price
- Breakouts come with higher volume and need confirmation on the second day, with above-average volume afterwards
- Plot volume with an overlaid MA20 (there are about 20 trading days per month)
- keep it simple with charts: price and volume are the most important
- concentrate on current investments, not past or future ones
- only look at the daily chart, not 30 mins, 5 mins or anything. it makes you crazy
- breaking resistance needs a follow through day, i.e. another day up
- in a bull market a normal pullback from the highs is a 50% retracement
- always look at daily, weekly, monthly to spot resistance / support points
- Watch volume on the daily chart so it is easier to compare with volume on other days
- Read news and find trading ideas 6:00 - 7:30am (Europe), for sure before the trading day starts
- Always keep some cash for short term opportunities
- You don't need to trade every day! You don't need to trade every day!
- If a company publishes earnings and the stock doesn't move much it might be that most people already own the stock. It could go down.
- Make sure you know who publishes the information, who's behind the news
- Stay away from penny stocks
- Use a stock scanner the day before trading to find potential candidates for trading
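A few of the rules above (volume against its MA20, a breakout with a follow-through day, the 50% retracement) can be sketched in code. The exact definitions here are my assumptions: "resistance" is taken as the highest close of the previous 20 days, and "above average" means above the 20-day average volume before the breakout. All function names are made up.

```python
def moving_average(values, window=20):
    """Simple moving average; None until `window` values are available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def confirmed_breakouts(closes, volumes, window=20):
    """Indices of breakout days that get a follow-through day.

    A breakout closes above the highest close of the prior `window` days
    on above-average volume; the next day must close higher again, also
    on above-average volume.
    """
    vol_ma = moving_average(volumes, window)
    hits = []
    for i in range(window, len(closes) - 1):
        resistance = max(closes[i - window:i])
        avg_volume = vol_ma[i - 1]  # average of the days before the breakout
        broke_out = closes[i] > resistance and volumes[i] > avg_volume
        follow_through = closes[i + 1] > closes[i] and volumes[i + 1] > avg_volume
        if broke_out and follow_through:
            hits.append(i)
    return hits


def retracement_level(low, high, fraction=0.5):
    """Price after giving back `fraction` of the move from low to high."""
    return high - fraction * (high - low)
```

This is only a screening aid for daily data, in the spirit of running a scanner the day before; the thresholds are not tuned to anything.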
Key economic indicators
(always check for the general market direction)
- Interest rate (price of money): if interest rates are low it's good for the stock market
- Inflation: high inflation is bad for the economy, and so is deflation; in either case most sectors will do badly
- GDP: steadily rising GDP is good for corporate earnings. Two consecutive quarters of falling GDP are commonly called a recession
- unemployment rate
- consumer confidence (the less consumer confidence the less spending happens)
- US house prices
- Most important indicator: ISM Index (a very good predictor of how the economy will be in 3 - 6 months): above 50 it is expanding, below 50 it is contracting
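Two of the indicators above come with explicit rules, so they are easy to write down. A small sketch with made-up function names; the input is assumed to be a list of GDP levels in chronological order, one per period.

```python
def ism_signal(ism_index):
    """Read the ISM index against the 50 line."""
    if ism_index > 50:
        return "expanding"
    if ism_index < 50:
        return "contracting"
    return "unchanged"


def in_recession(gdp_levels):
    """True if GDP fell in each of the two most recent periods."""
    if len(gdp_levels) < 3:
        return False  # not enough history to see two consecutive changes
    a, b, c = gdp_levels[-3:]
    return b < a and c < b
```

For instance, GDP levels of 100, 99, 98 count as two consecutive falls, while 100, 99, 99.5 do not.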