Small-Scale Question Sunday for April 21, 2024

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

Does anybody like programming?

I have been hired as a sole and lead Python developer in a company. But my Python experience is mostly on Numpy, if anybody has some tips? It would be very appreciated!

Partly a response, partly hijacking this to ask a question of my own to everyone else: what are you using as an editor/compiler?

I programmed exclusively in Java for years, but my new boss wanted programs in Python so I've been doing that this past year. Using Eclipse, which is wonderful as an editor, since it lets me organize everything and highlights typos that I make and stuff.

Aside from a whole lot of friction involving different conventions and abilities, I was annoyed that all of the Python editors people recommended seemed way less functional until I discovered that I can program Python in Eclipse if I do the right stuff. So I've been doing that.

I'm not sure what the general consensus is, because I'm mostly self-taught and program on my own, making mathematical models for research purposes that nobody else has to use or collaborate with, so I've probably got all sorts of weird habits that would make more sophisticated programmers cringe. So I can't tell how much of this is objective and how much is just me being used to Eclipse for so many years and having little experience with anything else. But I tentatively recommend looking into PyDev for Eclipse, because in my opinion it's nice.

Yeah, seconding both prongs, here: a) IDEs are important and b) Python IDEs near-universally suck. If you were in the Java sphere before, PyCharm is kinda the IntelliJ-for-Python, for better and worse, and there's a large faction that loves VSCode for eating all of their RAM handling multi-language projects reasonably, but for the love of god don't try to build class-ful python in IDLE.

((I'll generally advocate PyCharm for new programmers, as annoying some of the Intellijisms can be, but if you're more acclimatized to and have already set up Eclipse it's definitely not worth swapping.))

VSCode for eating all of their RAM

I don't know where this myth came from - usually bad extensions are the memory hogs.

This is my VSCode at the moment: 3 GB of RAM, way less than my browsers. And 32 GB of RAM was the baseline for a dev computer 8 years ago.

Image Commit (KB) Working Set (KB)
Code.exe 229,192 203,380
Code.exe 196,436 181,052
Code.exe 181,540 158,272
Code.exe 146,848 143,044
Code.exe 170,172 146,452
Code.exe 158,840 142,484
Code.exe 116,608 114,484
Code.exe 149,196 117,328
Code.exe 112,392 98,688
Code.exe 90,580 92,056
Code.exe 86,820 97,276
Code.exe 1,423,064 86,692
Code.exe 73,020 76,104
Code.exe 73,356 75,808
Code.exe 56,140 61,656
Code.exe 56,864 59,304
Code.exe 50,748 41,668
Code.exe 39,788 44,236
Code.exe 37,548 43,608
Code.exe 23,656 22,300
Code.exe 22,832 21,832
Code.exe 21,308 20,340
Code.exe 21,208 20,296
Code.exe 20,956 20,192
Code.exe 20,924 22,388
Code.exe 21,160 23,428
Code.exe 17,980 16,276
Code.exe 17,992 16,244
Code.exe 18,004 15,824
Code.exe 15,096 21,176
Code.exe 11,004 9,376

Dude, 3 GB of RAM usage is in no way acceptable. You're saying "I don't know where this myth came from" while providing evidence that it's not a myth at all. VSCode is a memory hog, like all Electron apps.

Since when is 3GB memory hogging?

Since always. Even in the modern day when a system will easily have 16-32 GB of memory, that's 10% (or 20%) of the entire system! It's not remotely acceptable for a single app to take up that much memory.

By comparison, Sublime Text (which is very much in the same ballpark in terms of features) takes up 998 MB including memory shared with other processes. It uses just 210 MB discounting the shared memory!! That's the sort of performance you can get when software is written by people who give a shit, not lazy devs who go "eh Electron is fine, people have lots of RAM these days".

Since always. Even in the modern day when a system will easily have 16-32 GB of memory, that's 10% (or 20%) of the entire system! It's not remotely acceptable for a single app to take up that much memory.

Disagree. RAM exists to be used. There are lots of performance reasons for trading off memory utilization with CPU processing and storage IO, and a complex program which is a primary use case for a PC should make those tradeoffs in favor of more RAM utilization unless operating in a memory-constrained environment.

RAM exists to be used, and the app developer should humbly realize that the user (this is about the user, right?) may have a use for that RAM and therefore optimize the software.

Since always. Even in the modern day when a system will easily have 16-32 GB of memory, that's 10% (or 20%) of the entire system! It's not remotely acceptable for a single app to take up that much memory.

Except VSCode and Brave and DBeaver are roughly 100% of what I do on a machine while developing.

So it follows that the app being developed eats roughly 0% of memory?

Look, if you're content for apps to hog memory because you use them exclusively I can't really stop you. Go nuts. But to me it's not an acceptable level of performance, because I use my computer for many things and I expect it to be able to support them all at once.

Ten years ago a brand-new processor would have been the Haswell- or Broadwell-era, and while you could get machines that could hold 32GB RAM, the H81 chipset only supported up to 16GB, going to 32GB would not have been standard, and it'd probably cost you upwards of 250 USD in RAM alone.

But more centrally, VSCode's linter and intellisense implementation is perfectly fine for mid-sized projects without a boatload of dependencies in certain languages. Get outside of those bounds, and its RAM usage can skyrocket. Python tends to get it hard (as does Java, tbf) because of popular libraries with massive and somewhat circular dependency graphs, but I've seen large C++ projects go absolutely tango uniform, with upwards of 10GB.

Yes, it is usually an extension problem, but given that you'll end up needing to install a few extensions for almost every language you work with just to get them compiling (never mind debugging!), and that it's often even Microsoft-provided extensions (both vscode-cpptools and vscode-python have bitten me, personally), that doesn't actually help a lot. Yes, you can solve it by finding the extension and disabling it, and sometimes there are even alternative extensions for the same task that do work.

The normal case isn't much worse, and sometimes is better, than alternatives like IntelliJ/PyCharm. But the worst cases are atrocious, and they're not just things hitting some rando on a github issue with some weird outlier use case.

going to 32GB would not have been standard, and it'd probably cost you upwards of 250 USD in RAM alone.

My PC built in 2016 with Skylake (2015) had 64GB of RAM. The one I assembled in 2010 had 32. And with developer salaries being what they are, it was always affordable, even in Eastern Europe.

32GB was possible on Sandy Bridge processors (technically 2011), but mid-range Westmere and Nehalem processors only supported 16GB(ish) for most of the consumer market, and even the high-end Bloomfield capped at 24GB. I'm not saying you didn't do it -- I've got a couple Xeon systems from that era floating around that could have -- but it was absolutely not a standard use case.

A more normal midrange system would be closer to 4GB, with 8GB as the splurge. You'd probably end up spending over 400 USD in RAM alone, plus needing to spec up your motherboard to support it (thanks, Intel, for the fucky memory controller decision).

32GB was possible on Sandy Bridge processors (technically 2011),

So it probably was Sandy Bridge. It certainly wasn't a Xeon. Too many years. I remember having a Core 2 Duo in 2006 or 7 with 8GB, I remember that the PC I built in 2016 had 64 (which I still haven't changed, the performance growth in everything but the GPUs has been pathetic), and I remember that it replaced a PC with 32 - so it probably was early 2011. Also possible I built one in 2010 with 16 and then one in 2012 with 32.

Anyway RAM was peanuts compared to the payroll for developers so it didn't make any sense to not pump their workstations.

Wth, my 2018 pc only had 16 until I recently upgraded to 32. I think you were in the top fraction of a percent of users.

Do you know if there's a way to.... I'm not even sure what the right language is here.... put different classes in different .py files, or at least different tabs, without running into recursive dependency issues?

Like, in Java, I can make a World class that contains a population from the Agent class, and models an epidemic going through them, and the Agents have a bunch of methods internally regarding how they function as they get infected and recover and stuff. And if I pass a copy of the main World to each Agent when it's created, then when they do stuff in their methods they can call back up to the World, usually for counting purposes, they say "hey I got infected, increment the total infection counter" or "hey someone was going to infect me but I'm already infected, increment the redundant infection counter".

As far as I can tell, in Python I can't do that nicely. If the World class imports Agent, then the Agent class can't import World. I can resolve this by defining both classes in the same .py file, but then all my code is arranged 1-dimensionally and I have to scroll through tons of stuff to find what I'm looking for (or use Ctrl+F). Whereas in Java each class has its own tab that I can open or close or switch to, so well-behaved ones that I'm not working on don't take up space or get in my way. I'm not sure if this is a Python issue or just an Eclipse issue. Is there a way to split a .py file into multiple tabs so I can organize better?

This sounds less like a Python problem and more like a "you need to learn how to architect projects and write clean maintainable code" problem. You know.. the Engineering part of Software Engineering..

Also, why are you importing Agent or World into each other at all? The World needs to be a Singleton that has-many agents. They should be declared in different files and a third file should manage both of their interactions.

I'd caution that:

  1. Python's support for the singleton pattern is kinda jank, due to lack of first-class support for private constructors or access modifiers.
  2. While there are a lot of arguments in favor of the singleton pattern with an interaction controller for bigcorp work, in small businesses it can be a temptation with serious tradeoffs. Refactoring (whether to add an intermediate object between World and Agent, or if you end up needing multiple World objects such as for a fictional context) can be nightmarish in Python, even if all the interaction logic is properly contained. And it probably won't be properly contained: marketing and customers can end up demanding bizarre requirements on near-zero notice that can require information from multiple different singletons, and if you end up hiring (or taking interns!) as a small business rather than at the FAANG level, those people (and I was one of them once!) will often break around the interaction controller unless aggressively managed.
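To make point 1 concrete, here's a rough sketch of one common way people fake a singleton in Python, and how easy it is to get around. The `_instance` attribute and the snake_case counter name are made up for illustration; this is not anyone's actual project code.

```python
class World:
    _instance = None  # hypothetical cache holding the one shared World

    def __new__(cls):
        # Build the instance the first time, then keep handing back the same one.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.total_infections = 0
        return cls._instance


first = World()
second = World()
first.total_infections += 1
same_object = first is second  # both names point at the one World

# Nothing is actually private: any caller can reset the cache
# and walk away with a second, independent World.
World._instance = None
third = World()
bypassed = third is not first
```

The leading underscore is only a convention, which is roughly what "no access modifiers" means in practice.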

The World needs to be a Singleton

Eppur si muove!

I'm... not very good with Python, but by my understanding, a toy example would be:

main.py:

import agent
import world

agentCount = 20
infectionCount = 25
world = world.World()
print("Starting...")
for i in range(agentCount):
    world.addAgent(agent.Agent(world))

for i in range(infectionCount):
    world.infectRandomAgent()

print("Total Infections :" + str(world.totalInfections))
print("Total Redundant Infections :" + str(world.redundantInfections))
for i in range(agentCount):
    print("Agent #" + str(i) + " Infections:" + str(world.knownAgents[i].countedInfections))

world.py:

import random

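class World:
    def __init__(self):
        # Per-instance state; a class-level list would be shared by every World.
        self.knownAgents = []
        self.totalInfections = 0
        self.redundantInfections = 0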

    def addAgent(self, newAgent):
        self.knownAgents.append(newAgent)

    def infectRandomAgent(self):
        random.choice(self.knownAgents).incrementInfection()

agent.py:

class Agent:
    wasInfected = False
    countedInfections = 0

    def __init__(self, ownerWorld):
        self.world = ownerWorld

    def incrementInfection(self):
        self.world.totalInfections += 1
        if self.wasInfected:
            self.world.redundantInfections += 1
        self.wasInfected = True
        self.countedInfections += 1

Note that if you're using raw python3.exe or a basic IDE like IDLE, all three files will need to be in the same folder, or you have to treat them like modules. Better IDEs like PyCharm will handle most of this for you, though I'd recommend experimenting before futzing with it a lot.

__init__ is a python builtin capability that's pretty equivalent to Java constructors. The first argument of any method defined in a class will act as a reference to the instance that the method is being called on, regardless of name -- do settle on a convention for that early and stick to it, or it'll drive you up the walls. self is popular in pythonic circles, but I've seen a surprisingly large project that took the convention of this<className>, probably downstream of java or C# devs.
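A tiny sketch of the "regardless of name" part, using a made-up Counter class (not from the example above). One method names the instance `this`, the other uses `self`, and both work:

```python
class Counter:
    def __init__(this, start):
        # "this" works because only the position matters, not the name.
        this.value = start

    def bump(self):
        # Same kind of reference, conventional name.
        self.value += 1
        return self.value


counter = Counter(5)            # runs __init__, much like a Java constructor
result = counter.bump()         # the instance is passed in automatically
also = Counter.bump(counter)    # explicit form: the instance is just the first argument
```

Mixing both names in one class like this is exactly what you shouldn't do in real code; it's only here to show the mechanism.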

Only your main simulation file really should need to import the files that make up the actual objects. The class objects themselves don't need to know about each other, even if they're calling methods or fields specific to the other class, because that gets looked up during live runtime operations.

(edit: specifically, the class calling the constructor for an instance of an object needs to import that object. You could, and it would probably be cleaner, import agent within world.py rather than within main.py, and do the agent constructor in the form:

    def addAgent(self):
        self.knownAgents.append(agent.Agent(self))

But I've been burned before in python environments where I ended up with my class imports spread throughout a hundred places and it being a nightmare to refactor or rename or handle versioning, so my preference for non-giant projects is to centralize imports, and for giant python projects you probably should be breaking it into modules.)
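For the "classes don't need to know about each other" point, here's a minimal sketch of what an agent.py could look like if you still want your IDE to know the type. It assumes the world.py / World names from the toy example; the snake_case attribute names and StandInWorld are hypothetical, there only so the snippet runs by itself.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only type checkers and IDEs follow this import; it never runs,
    # so agent.py does not depend on world.py at runtime.
    from world import World


class Agent:
    def __init__(self, owner_world: "World"):
        self.world = owner_world
        self.was_infected = False

    def increment_infection(self):
        # The attribute lookup on self.world happens when the method runs.
        self.world.total_infections += 1
        if self.was_infected:
            self.world.redundant_infections += 1
        self.was_infected = True


class StandInWorld:
    """Hypothetical stand-in; any object with these two counters works."""

    def __init__(self):
        self.total_infections = 0
        self.redundant_infections = 0


demo_world = StandInWorld()
demo_agent = Agent(demo_world)
demo_agent.increment_infection()
demo_agent.increment_infection()
```

The quoted "World" annotation is just a string at runtime, so there's no circular import to trip over, and the editor can still offer completion on self.world.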

I've been doing it like that, where they're all together and reference each other, it's just that then when Agent has 15 methods because some of them are experimental variations on each other or niche things I wanted to do to see what would happen, then I make another class for graphing scatter plots, and I've got a bunch of methods for (Make a world, then modify the parameters according to X, then execute Y, then graph the results, then repeat that N times) that would be nice to stick in their own class somewhere, and then I've got a bunch of useful static methods that do stuff like load and save data to CSVs that would be nice to have in their own class for organization purposes. And if I just lay them out linearly (which I mostly have, with a few rare exceptions that definitely have 0 recursive dependencies and I actually have moved them to their own .py file) then I have literally 2000 lines of code I have to scroll up and down just to find the right class whenever I want to check to see what the name of the method I want to call is or something, and then scroll back down to find the spot I'm working on.

There's nothing like the partial class concept from C#, though I agree it would be really nice if there were.

You can kinda fake it by exploiting the heck out of inheritance, in a couple different ways, depending on what level of composition you're aiming to be able to do. If you want selective import of behaviors (and to avoid the diamond inheritance problem, mostly), you can do something like:

agentInfectionLogic.py:

wasInfected = False
countedInfections = 0

def incrementInfection(self):
    self.world.totalInfections += 1
    if self.wasInfected:
        self.world.redundantInfections += 1
    self.wasInfected = True
    self.countedInfections += 1

def infectedCount(self):
    return self.countedInfections

agentFileLogic.py:

def loadInfectionInfo(self):
    temploadInfections = 20
    for x in range(temploadInfections):
        self.incrementInfection()
    # do an actual file load here.

def saveInfectionInfo(self):
    tempfile = self.infectedCount()
    # save an actual file here.

agent.py:

class Agent:
    from agentInfectionLogic import infectedCount, incrementInfection, countedInfections, wasInfected
    from agentFileLogic import saveInfectionInfo, loadInfectionInfo

    def __init__(self, ownerWorld):
        self.world = ownerWorld

And then calls like world.knownAgents[0].loadInfectionInfo() or world.infectRandomAgent() would work as normal, and you can even swap between different experimental forms by having from agentInfectionLogic import infectedCount, incrementInfection, countedInfections, wasInfected or from testAgentInfectionLogic import infectedCount, incrementInfection, countedInfections, wasInfected (or even a mix-and-match between the two).

agent.py has to know about what's going on, but to everywhere else, anything imported into agent.py looks identical to how it would if it were coded into that file or class. Eventually this turns into a full module, where the __init__.py file holds the glue and then you have better names for your actual logic .pys, but when that makes sense depends a lot on the scale of your project.
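For comparison, the plain-inheritance flavor of the same idea is usually called a mixin. A rough sketch, with hypothetical class names; each mixin would normally sit in its own .py file, they're in one block here only so it runs as-is:

```python
class InfectionLogic:
    """Would live in its own file, e.g. a hypothetical infection_logic.py."""

    def increment_infection(self):
        self.counted_infections += 1

    def infected_count(self):
        return self.counted_infections


class FileLogic:
    """Would live in another file, e.g. a hypothetical file_logic.py."""

    def load_infection_info(self):
        # Placeholder for a real file load: replay 20 stored infections.
        for _ in range(20):
            self.increment_infection()


class Agent(InfectionLogic, FileLogic):
    # Agent only holds the state; the behavior comes from the mixins.
    def __init__(self):
        self.counted_infections = 0


agent = Agent()
agent.load_infection_info()
loaded = agent.infected_count()
```

The tradeoff versus importing functions into the class body is that you swap whole classes rather than individual methods, and you have to keep an eye on method resolution order if two mixins define the same name.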

Bump. Please someone answer this. I have the exact same issue and both GPT-4 and Google are not helping.

The term you’re looking for is circular dependency. That should hopefully help you on your Google quest.
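For what it's worth, two modules importing each other isn't always fatal in Python. A sketch under some assumptions: the module names world_demo and agent_demo are made up, and the script writes them to a temp folder purely so the whole thing runs from one file. The trick is using plain `import module` and only touching the other module's contents inside methods:

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

WORLD_SRC = textwrap.dedent('''
    import agent_demo  # module-level import of the other module

    class World:
        def __init__(self):
            self.total_infections = 0
            self.agents = []

        def add_agent(self):
            self.agents.append(agent_demo.Agent(self))
''')

AGENT_SRC = textwrap.dedent('''
    import world_demo  # circular, but only the module object is needed here

    class Agent:
        def __init__(self, owner):
            # world_demo.World is looked up at call time, after both modules loaded.
            assert isinstance(owner, world_demo.World)
            self.world = owner

        def infect(self):
            self.world.total_infections += 1
''')

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "world_demo.py").write_text(WORLD_SRC)
    Path(folder, "agent_demo.py").write_text(AGENT_SRC)
    sys.path.insert(0, folder)
    try:
        world_demo = importlib.import_module("world_demo")
        world = world_demo.World()
        world.add_agent()
        world.agents[0].infect()
        total = world.total_infections
    finally:
        sys.path.remove(folder)
```

What tends to break is `from world_demo import World` at the top of agent_demo.py, because at that moment world_demo is only half loaded and World doesn't exist yet.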