> On Wed, at 12:40 PM, Aldcroft, Thomas wrote:
> There is now a working implementation of this if anyone wants to give it a whirl or suggest improvements:

> On Wed, at 2:34 PM, Aldcroft, Thomas wrote:

The trick to make it fast is to loop over the memory buffer of the data in the Table; looping over the table elements directly is orders of magnitude slower.

> The example shows how to compute the stellar mass-weighted average star formation rate on a group-by-group basis for a large fake galaxy table generated at the beginning of the gist.

When I can, I use the built-in Table aggregation features; when that doesn't work, I do this.

> I use this pattern all the time in my work. Once you see how it works, it's pretty straightforward to just copy-and-paste and adapt a couple of lines for your particular problem.

It's pretty fast, and it seems to require a bit less code. I have not benchmarked this against Tom's solution, but the solution in the following gist is based on running np.unique on an Astropy Table sorted on the grouping key.

> I wrote up a gist of the Numpy solution to the problem that I made passing mention of in my first reply.

It seems that, in short, there is no neat way to do this directly in astropy, but it might be coming soon. Since the previous sentence steers this topic in the direction of the astropy development mailing list, I'll leave the discussion on an implementation to Github.

Since I already had half a mind to make a PR (if there were interest and I hadn't overlooked the obvious), I've now implemented a PR which somewhat mimics the Pandas one (in a more simplistic manner), for thoughts and comparison with Thomas's PR. It's then up to the user to return a new and correct (single-row) dataframe, but this easily allows fancy tricks with e.g. (boolean) indexing with yet another column, or even using the value of the key column(s) that have been aggregated over.
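The Pandas-style idea mentioned in this thread, where each group's row slice is handed to a user function, can be sketched roughly as follows. This is only an illustration, not the code from either PR: a NumPy structured array stands in for the Table or dataframe, and `apply_by_group`, `mean_sfr_of_massive`, and the column names `halo_id`, `mstar`, and `sfr` are hypothetical names made up for the example.

```python
import numpy as np


def apply_by_group(table, key, func):
    """Call func(sub_table) once per distinct value of table[key].

    Hypothetical helper: `table` is a NumPy structured array standing in
    for a Table/dataframe. Returns (unique_keys, list_of_results).
    """
    # Sort on the grouping key so that each group is a contiguous row slice.
    order = np.argsort(table[key], kind="stable")
    sorted_table = table[order]
    # Index of the first row of each group in the sorted table.
    keys, starts = np.unique(sorted_table[key], return_index=True)
    bounds = np.append(starts, len(sorted_table))
    results = [func(sorted_table[lo:hi])
               for lo, hi in zip(bounds[:-1], bounds[1:])]
    return keys, results


def mean_sfr_of_massive(group):
    """User function: boolean indexing on one column, aggregating another."""
    mask = group["mstar"] > 1.0e10
    return group["sfr"][mask].mean() if mask.any() else np.nan


# Tiny made-up table: (halo_id, stellar mass, star formation rate).
galaxies = np.array(
    [(1, 2.0e10, 3.0), (2, 5.0e9, 1.0), (1, 4.0e9, 9.0),
     (2, 3.0e10, 2.0), (1, 6.0e10, 5.0)],
    dtype=[("halo_id", "i8"), ("mstar", "f8"), ("sfr", "f8")],
)

keys, means = apply_by_group(galaxies, "halo_id", mean_sfr_of_massive)
```

Because the user function sees the whole sub-table, it can mix columns freely (here a mass cut decides which rows contribute to the mean), which is the flexibility argued for above. The price is one Python-level call per group.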
I guess I'm rather partial to the way Pandas handles this, where the (row-sliced) dataframe gets passed to the user function. I had a look at Thomas's PR, but it feels slightly off somehow.

I hadn't thought about iterating over the groups (Andrew's gist); good to keep in mind. I realise the example is overly simplistic and can be solved (better) differently, but it was indeed just meant as an example of the underlying issue. Seems like there's certainly interest in this.

Hi, thanks for the great responses! In particular with a gist and a PR.
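For reference, the approach described earlier in the thread (sort on the grouping key, then run np.unique to find where each group starts) can be sketched as below for the stellar mass-weighted average star formation rate. This is a minimal sketch under assumptions, not the contents of either gist: plain NumPy arrays stand in for the Table columns, the fake data and the names `halo_id`, `mstar`, and `sfr` are invented here, and `np.add.reduceat` is my substitution for an explicit loop over the group boundaries.

```python
import numpy as np

# Fake galaxy catalogue (assumed column names, random values).
rng = np.random.default_rng(0)
n_gal = 1000
halo_id = rng.integers(0, 50, size=n_gal)      # grouping key
mstar = 10 ** rng.uniform(9, 11, size=n_gal)   # stellar mass
sfr = rng.uniform(0.1, 10, size=n_gal)         # star formation rate

# Sort every column on the grouping key so groups are contiguous.
order = np.argsort(halo_id, kind="stable")
halo_sorted = halo_id[order]
mstar_sorted = mstar[order]
sfr_sorted = sfr[order]

# First row index of each group in the sorted arrays.
group_ids, starts = np.unique(halo_sorted, return_index=True)

# Segment sums between consecutive group starts (last segment runs to the end).
weighted_sum = np.add.reduceat(mstar_sorted * sfr_sorted, starts)
total_mass = np.add.reduceat(mstar_sorted, starts)

# Mass-weighted mean SFR, one value per entry in group_ids.
weighted_mean_sfr = weighted_sum / total_mass
```

The same `starts` array could instead drive a Python loop over slices of the sorted columns when the per-group calculation does not reduce to sums.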