Please refer to the docs, [Different Choices for Indexing][1], which state clearly when and why you should use **.loc** and **.iloc** over **.ix**. It comes down to being explicit about the use case:

> .ix supports mixed integer and label based access. It is primarily
> label based, but will fall back to integer positional access unless
> the corresponding axis is of integer type. .ix is the most general and
> will support any of the inputs in .loc and .iloc. .ix also supports
> floating point label schemes. .ix is exceptionally useful when dealing
> with mixed positional and label based hierarchical indexes.
>
> However, when an axis is integer based, ONLY label based access and
> not positional access is supported. Thus, in such cases, it’s usually
> better to be explicit and use .iloc or .loc.


### Update 22 Mar 2017

Thanks to a comment from @Alexander: **pandas** is going to deprecate `ix` in **0.20**; details are [here](http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#deprecate-ix).

One of the main reasons is that mixing positional and label-based indexing (which is effectively what `ix` does) has been a significant source of problems for users.

You are expected to migrate to `iloc` and `loc` instead; here is a link on [how to convert code](http://pandas-docs.github.io/pandas-docs-travis/indexing.html#indexing-deprecate-ix).
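
As a rough illustration of why being explicit matters, here is a minimal sketch. The Series `s` and DataFrame `df` are made-up data, not taken from the docs. With an integer index, label 0 and position 0 can point at different rows, and a mixed position/label lookup can usually be rewritten with either `loc` or `iloc`:

```python
import pandas as pd

# Integer labels that deliberately differ from the positions 0, 1, 2.
s = pd.Series(["a", "b", "c"], index=[2, 0, 1])

by_label = s.loc[0]      # the row whose label is 0
by_position = s.iloc[0]  # the first row, whatever its label is

# A mixed lookup (rows by position, column by label), written explicitly.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=list("abc"))
via_loc = df.loc[df.index[[0, 2]], "A"]               # turn positions into labels
via_iloc = df.iloc[[0, 2], df.columns.get_loc("A")]   # turn the label into a position
```

Both `via_loc` and `via_iloc` select rows `a` and `c` of column `A`; which form reads better probably depends on whether the surrounding code thinks in labels or in positions.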

  [1]: http://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing

When someone writes "nit: removed whitespace" on a commit, what does "nit" mean? I've also seen it capitalized as if it were an abbreviation (i.e. NIT). For an example of its usage, see [this post](https://news.ycombinator.com/item?id=7357149):

> Of course there is a difference between a comment saying: "Nit:
> Trailing whitespace" and "According to Section V, Subsection VII of
> the Coding Manual you should never add trailing whitespace. Please see
> that you don't." or some stuff like that. The latter is a
> passive-aggressive potshot, the former IMO is just a quick reminder.

Another example, from ["Chromium Code Reviews"](https://codereview.chromium.org/9662):

> Issue 9662: fix minor style nit (Closed)

EDIT: And an answer comes from [Bugzilla's review page](https://wiki.mozilla.org/Bugzilla:Review):

> Sometimes the reviewer will prefix his comments with "Nit:". This
> means that he's just "nitpicking"--you don't have to fix these points,
> but we'd like you to.

 

What does &quot;nit&quot; mean in hacker-speak?

I have the following rules:

    'Fno' => 'digits:10'
    'Lno' => 'min:2|max5'  // this seems invalid

How can I write the rules so that:

Fno must be digits only, with a minimum of 2 and a maximum of 5 digits, and

Lno must be digits only, with a minimum of 2 digits?

Laravel Rule Validation for Numbers

I'm learning the Python pandas library. Coming from an R background, the indexing and selecting functions seem more complicated than they need to be. My understanding is that .loc() is only label based and .iloc() is only integer based.

**Why should I ever use .loc() and .iloc() if .ix() is faster and supports integer and label access?** 

Is .ix() always better than .loc() and .iloc() since it is faster and supports integer and label access?


I'm trying to install python 3.x on an AWS EC2 instance and:

    sudo yum install python3

doesn't work:

    No package python3 available.

I've googled around and I can't find anyone else who has this problem so I'm asking here. Do I have to manually download and install it?

How do I install Python 3 on an AWS EC2 instance?

I tried to install Scrapy for Python 2.7.8 (anaconda 2.1.0) 32-bit using 

    pip install scrapy

And I got this error 

     error: Microsoft Visual C++ 10.0 is required (Unable to find vcvarsall.bat).

I have followed the solutions found in these Stack Overflow questions. Nothing worked.

https://stackoverflow.com/questions/26140192/microsoft-visual-c-compiler-for-python-2-7

https://stackoverflow.com/questions/6126737/cant-find-vcvarsall-bat-file

https://stackoverflow.com/questions/2817869/error-unable-to-find-vcvarsall-bat

https://stackoverflow.com/questions/24380442/getting-error-unable-to-find-vcvarsall-bat-when-running-pip-install-numpy-o

https://stackoverflow.com/questions/19830942/pip-install-gives-error-unable-to-find-vcvarsall-bat

https://stackoverflow.com/questions/6551724/how-do-i-point-easy-install-to-vcvarsall-bat/8705722#8705722

https://stackoverflow.com/questions/24783176/pip-install-mysql-python-returns-unable-to-find-vcvarsall-bat

This is the error, and a few lines above and below it:

    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\readme.txt
    -> build\lib.win32-3.4\lxml\isoschematron\resources\xsl\iso-schematron-xslt1

    running build_ext

    building 'lxml.etree' extension

    C:\Python34\lib\distutils\dist.py:260: UserWarning: Unknown distribution opt
    ion: 'bugtrack_url'

      warnings.warn(msg)

    error: Microsoft Visual C++ 10.0 is required (Unable to find vcvarsall.bat).


    ----------------------------------------
    Command "C:\Python34\python.exe -c "import setuptools, tokenize;__file__='C:
    \\Users\\San\\AppData\\Local\\Temp\\pip-build-wp6ei6r9\\lxml\\setup.py';exec(com
    pile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __f
    ile__, 'exec'))" install --record C:\Users\San\AppData\Local\Temp\pip-kfkzr_67-r
    ecord\install-record.txt --single-version-externally-managed --compile" failed w
    ith error code 1 in C:\Users\San\AppData\Local\Temp\pip-build-wp6ei6r9\lxml


---

I have both Microsoft Visual Studio 12.0, and Microsoft visual C++ compiler package for Python 2.7, both of which have the vcvarsall.bat file. 

---

I have a system variable called 'VS120COMNTOOLS', and its path is set to

    C:\Program Files\Microsoft Visual Studio 12.0\Common7\Tools\

---

I also added both paths to my environment variables. I've also tried just adding one, and then the other. My Path looks like this

    C:\Program Files\Java\jdk1.7.0_25\bin;\Python27;\Python2\python.exe;C:\Python27\Scripts\;C:\Users\San\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\;C:\Program Files\Microsoft Visual Studio 12.0\VC\;

---

I also updated my setuptools (I think to version 8), which should autodetect the Microsoft Visual C++ Compiler for Python 2.7. However, I'm still getting the same error.

---

I have also tried using 

    easy_install scrapy

And I get this error

    error: Setup script exited with error: Microsoft Visual C++ 10.0 is required (Unable to find vcvarsall.bat).

---

I also have the following in my registry


    HKEY_LOCAL_MACHINE\Software\Microsoft\VisualStudio\9.0\Setup\VC\ProductDir
    HKEY_LOCAL_MACHINE\Software\Microsoft\VisualStudio\12.0\Setup\VC\ProductDir



Python Pip install Error: Unable to find vcvarsall.bat. Tried all solutions

I'm creating very simple charts with the matplotlib / pylab Python module. The letter "y" that labels the Y axis is on its side. You would expect this if the label was longer, such as a word, so as not to extend the outside of the graph to the left too much. But for a one-letter label, this doesn't make sense; the label should be upright. My searches have come up blank. How can I print the "y" horizontally?

How to print Y axis label horizontally in a matplotlib / pylab chart?
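
One possible approach, sketched here rather than taken from the original thread, is to pass `rotation=0` when setting the label. The plotted data and the `labelpad` value are arbitrary choices for illustration, and the `Agg` backend is selected only so the sketch runs without a display:

```python
import matplotlib

matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_xlabel("x")

# rotation=0 draws the label upright; labelpad leaves a little room
# between the label and the tick labels.
label = ax.set_ylabel("y", rotation=0, labelpad=15)
```

With the pylab-style interface the same keyword should work as `plt.ylabel("y", rotation=0)`.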

When selecting a sub dataframe from a parent dataframe, I noticed that some programmers make a copy of the data frame using the `.copy()` method. For example,

```python
X = my_dataframe[features_list].copy()
```

...instead of just

```python
X = my_dataframe[features_list]
```

Why are they making a copy of the data frame? What will happen if I don't make a copy?



Why should I make a copy of a data frame in pandas?
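
To make the question concrete, here is a small sketch of what `.copy()` guarantees. The column names and values are invented for illustration. Changes made to the copy never reach the parent frame:

```python
import pandas as pd

# Hypothetical parent frame and feature list, standing in for the ones in the question.
my_dataframe = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
features_list = ["a", "b"]

X = my_dataframe[features_list].copy()
X["a"] = 0  # modifies only the independent copy

parent_values = my_dataframe["a"].tolist()  # still the original values
```

Without the explicit `.copy()`, a later assignment such as `X["a"] = 0` may, depending on the pandas version, emit a `SettingWithCopyWarning`, because pandas cannot tell whether the intent was to modify the parent. The explicit copy states that intent.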

What is the idiomatic python way to hide traceback errors unless a verbose or debug flag is set? 

Example code:

    their_md5 = 'c38f03d2b7160f891fc36ec776ca4685'
    my_md5 = 'c64e53bbb108a1c65e31eb4d1bb8e3b7'
    if their_md5 != my_md5:
        raise ValueError('md5 sum does not match!')

Existing output now, but only desired when called with `foo.py --debug`:

    Traceback (most recent call last):
      File "b:\code\apt\apt.py", line 1647, in <module>
        __main__.__dict__[command] (packages)
      File "b:\code\apt\apt.py", line 399, in md5
        raise ValueError('md5 sum does not match!')
    ValueError: md5 sum does not match!

Desired normal output:

    ValueError: md5 sum does not match!

Here's a test script: https://gist.github.com/maphew/e3a75c147cca98019cd8


Hide traceback unless a debug flag is set
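
One way this could be done, offered as a sketch and not as the established idiom, is to catch the exception at the top level and re-raise it only when the flag is present. The function names `check_md5` and `run` and the plain `"--debug" in argv` check are assumptions made for this example:

```python
import sys


def check_md5(their_md5, my_md5):
    """Raise ValueError when the two checksums differ."""
    if their_md5 != my_md5:
        raise ValueError("md5 sum does not match!")


def run(argv):
    """Return an exit status; let the traceback through only with --debug."""
    debug = "--debug" in argv
    try:
        check_md5(
            "c38f03d2b7160f891fc36ec776ca4685",
            "c64e53bbb108a1c65e31eb4d1bb8e3b7",
        )
    except ValueError as exc:
        if debug:
            raise  # re-raise unchanged so Python prints the full traceback
        # Short form only: "ValueError: md5 sum does not match!"
        print(f"{type(exc).__name__}: {exc}", file=sys.stderr)
        return 1
    return 0
```

A script would then end with something like `sys.exit(run(sys.argv[1:]))`. Setting `sys.tracebacklimit = 0` is another option sometimes suggested, though it applies to every uncaught exception in the process.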

I'm trying to merge a (Pandas 14.1) dataframe and a series. The series should form a new column, with some NAs (since the index values of the series are a subset of the index values of the dataframe).

This works for a toy example, but not with my data (detailed below).

Example:

    import pandas as pd
    import numpy as np

    df1 = pd.DataFrame(np.random.randn(6, 4), columns=['A', 'B', 'C', 'D'], index=pd.date_range('1/1/2011', periods=6, freq='D'))
    df1
    
    A	B	C	D
    2011-01-01	-0.487926	0.439190	0.194810	0.333896
    2011-01-02	1.708024	0.237587	-0.958100	1.418285
    2011-01-03	-1.228805	1.266068	-1.755050	-1.476395
    2011-01-04	-0.554705	1.342504	0.245934	0.955521
    2011-01-05	-0.351260	-0.798270	0.820535	-0.597322
    2011-01-06	0.132924	0.501027	-1.139487	1.107873
    
    s1 = pd.Series(np.random.randn(3), name='foo', index=pd.date_range('1/1/2011', periods=3, freq='2D'))
    s1
    
    2011-01-01   -1.660578
    2011-01-03   -0.209688
    2011-01-05    0.546146
    Freq: 2D, Name: foo, dtype: float64
    
    pd.concat([df1, s1],axis=1)
    
    A	B	C	D	foo
    2011-01-01	-0.487926	0.439190	0.194810	0.333896	-1.660578
    2011-01-02	1.708024	0.237587	-0.958100	1.418285	NaN
    2011-01-03	-1.228805	1.266068	-1.755050	-1.476395	-0.209688
    2011-01-04	-0.554705	1.342504	0.245934	0.955521	NaN
    2011-01-05	-0.351260	-0.798270	0.820535	-0.597322	0.546146
    2011-01-06	0.132924	0.501027	-1.139487	1.107873	NaN


The situation with my data (see below) seems basically identical: concatenating a series with a DatetimeIndex whose values are a subset of the dataframe's. But it gives the ValueError in the title (blah1 = (5, 286), blah2 = (5, 276)). Why doesn't it work?

    In[187]: df.head()
    Out[188]:
    high	low	loc_h	loc_l
    time				
    2014-01-01 17:00:00	1.376235	1.375945	1.376235	1.375945
    2014-01-01 17:01:00	1.376005	1.375775	NaN	NaN
    2014-01-01 17:02:00	1.375795	1.375445	NaN	1.375445
    2014-01-01 17:03:00	1.375625	1.375515	NaN	NaN
    2014-01-01 17:04:00	1.375585	1.375585	NaN	NaN
    In [186]: df.index
    Out[186]:
    <class 'pandas.tseries.index.DatetimeIndex'>
    [2014-01-01 17:00:00, ..., 2014-01-01 21:30:00]
    Length: 271, Freq: None, Timezone: None
    
    In [189]: hl.head()
    Out[189]:
    2014-01-01 17:00:00    1.376090
    2014-01-01 17:02:00    1.375445
    2014-01-01 17:05:00    1.376195
    2014-01-01 17:10:00    1.375385
    2014-01-01 17:12:00    1.376115
    dtype: float64
    
    In [187]:hl.index
    Out[187]:
    <class 'pandas.tseries.index.DatetimeIndex'>
    [2014-01-01 17:00:00, ..., 2014-01-01 21:30:00]
    Length: 89, Freq: None, Timezone: None
    
    In: pd.concat([df, hl], axis=1)
    Out: [stack trace] ValueError: Shape of passed values is (5, 286), indices imply (5, 276)

Pandas concat: ValueError: Shape of passed values is blah, indices imply blah2

I would like to know if there is some way of replacing all negative numbers in a DataFrame with zeros.

How to replace negative numbers in Pandas Data Frame by zero
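
A minimal sketch of one option, assuming every column is numeric (the frame below is made-up data): `DataFrame.clip` with `lower=0` raises every value below zero up to zero and leaves the rest alone.

```python
import pandas as pd

# Hypothetical all-numeric frame.
df = pd.DataFrame({"a": [-1, 2, -3], "b": [4.5, -0.5, 0.0]})

# clip returns a new frame; the original df is left unchanged.
non_negative = df.clip(lower=0)
```

If the frame also holds non-numeric columns, the clipping would probably need to be restricted to the numeric ones first.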

This would be useful so I know how many unique groups I have to perform calculations on.  Thank you.

Suppose the groupby object is called `dfgroup`.

How to get number of groups in a groupby object in pandas?
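
For illustration, a short sketch using a made-up frame: the name `dfgroup` follows the question, and the data is an assumption. A groupby object exposes the number of groups as the `ngroups` attribute, and `len()` gives the same count:

```python
import pandas as pd

# Hypothetical data with two distinct groups.
df = pd.DataFrame({"job": ["sales", "sales", "market"], "count": [2, 4, 5]})
dfgroup = df.groupby("job")

n_groups = dfgroup.ngroups   # number of distinct groups
n_groups_len = len(dfgroup)  # same number, via len()
```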

I want to group my dataframe by two columns and then sort the aggregated results within the groups.

    In [167]: df

    Out[167]:
       count     job source
    0      2   sales      A
    1      4   sales      B
    2      6   sales      C
    3      3   sales      D
    4      7   sales      E
    5      5  market      A
    6      3  market      B
    7      2  market      C
    8      4  market      D
    9      1  market      E


    In [168]: df.groupby(['job','source']).agg({'count':sum})

    Out[168]:
                   count
    job    source       
    market A           5
           B           3
           C           2
           D           4
           E           1
    sales  A           2
           B           4
           C           6
           D           3
           E           7


I would now like to sort the count column in descending order within each of the groups. And then take only the top three rows. To get something like:

                    count
    job     source
    market  A           5
            D           4
            B           3
    sales   E           7
            C           6
            B           4
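
One way to get there, as a sketch under the assumption that the frame looks like the one printed above: aggregate, sort by group and then by descending count, and keep the first three rows of each group. The variable names `totals` and `top3` are introduced only for this example.

```python
import pandas as pd

# Rebuild the frame shown in the question.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# Sum per (job, source), keeping the keys as ordinary columns.
totals = df.groupby(["job", "source"], as_index=False)["count"].sum()

top3 = (
    totals.sort_values(["job", "count"], ascending=[True, False])
    .groupby("job")
    .head(3)                      # first three rows of each job, in sorted order
    .set_index(["job", "source"])
)
```

`head(3)` keeps rows in their current order, so sorting before grouping is what makes the kept rows the three largest counts per job.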

| Content Type | Original Author | Original Content on Stackoverflow |
|---|---|---|
| Question | megashigger | View Question on Stackoverflow |
| Solution 1 - Python | Anzel | View Answer on Stackoverflow |
