Doordash Principal Engineer on Microservices

A reasonably good, simple discussion of the pros and cons of monoliths and microservices.

Some interesting comments:

  • Conway’s law – the structure of a system reflects the structure of the organization that makes it.
  • Microservices have their issues because they are a technical solution to an organizational problem: what to do when a team gets too big.

Here are the links referenced:

The gritty world of retro game analysis

The world has gotten very familiar with retro hardware re-creations, game emulation, re-releases, speed runs, new games for old platforms, as well as new exploits, tools, and discoveries. The nitty gritty work of doing all of this, however, is a labor of love. For those that dig into the binary, there are tricky copyright concerns that need to be managed, only scraps of information about old hardware and software, highly optimized/tricky code that is tough to read, and almost no financial gain, except for commercial re-releases.

Made Up of Wires walks us through a live bit of decompiling of the PS1 classic Castlevania: Symphony of the Night to give you a taste of what this kind of work involves. It’s not really that different from any other reverse engineering, but it is surprisingly accessible because these old games were relatively small and simple.

Make your own video card!

Ben Eater decided to build his own VGA video card. Well, technically it’s more of a display adapter/controller since the card doesn’t provide any rendering or accelerate the image buffer generation portion – but it’s still a pretty fun watch.

This is pretty much how computer graphics started. Someone built a display controller. Then others added some helper hardware to speed up the buffer fills, then blitting, then rendering, AI upscaling/noise reduction, and now full on AI rendering. What a wild technology ride – but it was this early stuff that really got me excited about technology. You could create and build all of this kind of amazing stuff yourself.

Programming for the Larrabee/Xeon Phi

Back in the day, I worked on this little project called Larrabee, which later turned into the Intel Xeon Phi coprocessor. It was an ambitious and exciting platform. It offered a ton of 512-bit-wide vector instructions so it could operate like a lot of streaming GPU architectures, yet it was fully general-purpose x86.

It turned out that getting performance out of this hardware was difficult. To get the full potential of the hardware, you simply had to utilize the vector units. Without that, it is like writing a single-threaded app on an 8-core system. Single SIMD lane operation just wasn’t going to cut it, as a 2017 International Journal of Parallel Programming article put it:

“Our results show that, although the Xeon Phi delivers a relatively good speedup in comparison with a shared-memory architecture in terms of scalability, the relatively low computing power of its computational units when specific vectorization and SIMD instructions are not fully exploited makes this first generation of Xeon Phi architectures not competitive”

Using the Xeon Phi Platform to Run Speculatively Parallelized Codes

The paper, and the host of others linked on the page as references, are a good read and give some hints as to why fixed-function GPUs have an advantage when it comes to raw streaming throughput. Hint: cache and data flow behavior is as important as, if not more important than, utilizing vectorization in such architectures.
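
To put a rough number on the single-lane penalty described above, here is a back-of-the-envelope sketch. It assumes 32-bit elements in a 512-bit vector register and ignores clocks, caches, and memory bandwidth entirely, so treat it as an upper bound on what scalar code leaves on the table rather than a performance model. The function names are made up for illustration.

```python
def simd_lanes(vector_bits=512, element_bits=32):
    """Number of elements one vector instruction can process at once."""
    return vector_bits // element_bits


def scalar_fraction_of_peak(vector_bits=512, element_bits=32):
    """Fraction of peak vector throughput used when only one lane does work."""
    return 1 / simd_lanes(vector_bits, element_bits)


if __name__ == "__main__":
    print("32-bit lanes per 512-bit vector:", simd_lanes())
    print("64-bit lanes per 512-bit vector:", simd_lanes(element_bits=64))
    print("scalar code uses at most", scalar_fraction_of_peak(), "of peak")
```

With 32-bit floats that is 16 lanes, so purely scalar code can reach at most 1/16 of the vector unit’s peak throughput.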

Utilizing YouTube for infinite storage

Cloud storage is increasingly becoming less free. You can’t go long before your iPhone or Google account notifies you that you’re almost full or already full, and gives you a link to a handy-dandy subscription. But there is one place where you can upload all you want and the storage is still free: YouTube.

Adam Conway wrote up a fun little program that does exactly that. He creates video frames full of data and uploads them to YouTube. He tried QR codes, but YouTube compression artifacts made that untenable. Instead, he went brute force and each 1 or 0 was a 5×5 block of pixels set to the same color. At 1920×1080, that generates about 10KB of storage per frame.

He fired it up and gave it a whirl. It worked! He even posted the code on GitHub. It’s definitely too slow, and uses far too much storage, to be practical for any meaningful amount of data: you need to take the input file, encode each bit into a 5×5 block of pixels in an image, and then encode the images together into a video file.
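
The arithmetic and the block encoding described above can be sketched in a few lines of Python. This is not Adam Conway’s code, just a minimal illustration of the same idea under my own assumptions: frames are plain lists of grayscale values (0 or 255), and decoding averages each block so that some compression noise is tolerated. All function names are hypothetical.

```python
def bits_per_frame(width=1920, height=1080, block=5):
    """How many bits fit in one frame when each bit is a block x block square."""
    return (width // block) * (height // block)


def bytes_to_bits(data):
    """Expand bytes into a list of 0/1 values, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def encode_frame(data, width, height, block):
    """Return a frame (list of pixel rows) with one solid block per bit."""
    cols, rows = width // block, height // block
    bits = bytes_to_bits(data)
    if len(bits) > cols * rows:
        raise ValueError("data does not fit in one frame")
    bits += [0] * (cols * rows - len(bits))  # pad the unused blocks with zeros
    frame = []
    for r in range(rows):
        pixel_row = []
        for c in range(cols):
            pixel_row.extend([255 if bits[r * cols + c] else 0] * block)
        # Repeat the same pixel row 'block' times to make the squares.
        frame.extend(list(pixel_row) for _ in range(block))
    return frame


def decode_frame(frame, block, n_bytes):
    """Recover n_bytes from a frame by thresholding each block's average."""
    cols, rows = len(frame[0]) // block, len(frame) // block
    bits = []
    for r in range(rows):
        for c in range(cols):
            total = sum(frame[r * block + y][c * block + x]
                        for y in range(block) for x in range(block))
            bits.append(1 if total / (block * block) >= 128 else 0)
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for bit in bits[i * 8:(i + 1) * 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    print(bits_per_frame() // 8, "bytes per 1920x1080 frame")
    tiny = encode_frame(b"hi", width=40, height=10, block=5)
    print(decode_frame(tiny, block=5, n_bytes=2))
```

At 1920×1080 with 5×5 blocks there are 384 × 216 = 82,944 blocks, which is 10,368 bytes, matching the roughly 10 KB per frame quoted above.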

Still, it’s the one free place on the internet.

Article:

Connecting an external GPU

Do you want to do AI work but have a laptop, NUC, or other tiny form factor computer that cannot accept a gigantic GPU? Does your system have an OCuLink port? Then maybe this external GPU dock is for you.

The Minisforum DEG1 eGPU Dock allows you to plug an external GPU into your small form factor PC. The only trick is that you’ll need an OCuLink port. A number of small form factor PCs now come with OCuLink (like this AtomMan X7 Ti).

OCuLink is short for “Optical-Copper Link”, a standard that lets you connect PCIe devices using an external cable rather than an internal slot. OCuLink has been around in the server world for about a decade, but starting in 2024 it has become increasingly common on tiny form factor PCs like Intel’s NUCs. It is gaining popularity because it’s cheaper than complex solutions like Thunderbolt and offers almost direct PCIe speeds. OCuLink is virtually an extension of your device’s PCIe slot, boasting a bandwidth of up to 16 GB/s, which is much faster than Thunderbolt 4, which caps out at 5 GB/s.
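
As a quick sanity check on those bandwidth figures, here is a small sketch of the unit arithmetic. It assumes PCIe 4.0 signaling (16 GT/s per lane with 128b/130b encoding) and Thunderbolt 4’s nominal 40 Gb/s link rate; how many lanes a given OCuLink port actually exposes is something to check for your own device, and the helper names are made up for illustration.

```python
def gbit_to_gbyte(gbit_per_s):
    """Convert gigabits per second to gigabytes per second."""
    return gbit_per_s / 8


# PCIe 4.0: 16 GT/s per lane, 128b/130b line encoding.
PCIE4_GBYTE_PER_LANE = gbit_to_gbyte(16 * 128 / 130)


def pcie4_bandwidth(lanes):
    """Theoretical one-direction PCIe 4.0 bandwidth in GB/s for a lane count."""
    return lanes * PCIE4_GBYTE_PER_LANE


if __name__ == "__main__":
    print("Thunderbolt 4 nominal:", gbit_to_gbyte(40), "GB/s")
    for lanes in (4, 8):
        print(f"PCIe 4.0 x{lanes}: {pcie4_bandwidth(lanes):.2f} GB/s")
```

Under these assumptions the 16 GB/s figure lines up with an 8-lane PCIe 4.0 link, while a 4-lane link works out to a little under 8 GB/s, which is still comfortably above Thunderbolt 4’s nominal 5 GB/s.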

You can also buy desktop PC versions of OCuLink (like this one) to try things out. They’re kind of unique because they come with 2 components: a shim M.2 card that plugs into your PC, which then connects via OCuLink to a small connector board that your graphics card plugs into.

Here’s a review of the setup and performance. It’s extremely impressive. You can play Cyberpunk in 4K raytracing on a connected 4090 in Ultra at a steady 70fps. Even in overdrive it maintains a steady 50+ fps. Horizon Forbidden West at 4k Very High settings plays at a stable 80-100fps – even without framegen.

It’s still too much of a Frankenstein approach right now to be consumer friendly, but I think OCuLink has really raised the bar and is going to make Thunderbolt and USB really up their game.

Old versions of Long Dark

The Long Dark was a great game I started playing during early access and really enjoyed. The lonely and desolate wilderness feel worked really well with the struggle against very simple but brutal natural elements.

The game has been in development longer than some teenagers have even been alive, and has consequently changed a lot over that time. Kudos to the Long Dark team for making a time capsule that lets you go back to those early drops by entering a release code in Steam.

One should ALWAYS be cautious of trainers and save game editors. Some on the list do have viruses, so it’s a good idea to scan them with a virus scanner and only run them in a virtual machine. With that warning in mind, here are some of the older trainers for these early drops on GameCopyWorld.

Installing Black Forest Flux.1

Stable Diffusion really opened the world to what is possible with generative AI. Stable Diffusion 2 and 3 …well…did not go so well. For a while now, Stable Diffusion 1.5 has been your best bet for locally generated AI art, but it is really showing its age.

Now there is a new player in open source generative AI you can run locally. The developers from Stability.ai have founded Black Forest Labs and released their open source tool: Flux.1.

While there are plenty of online generative AIs like Midjourney, Adobe Firefly, and others, they usually require payment or allow only limited usage. What’s great about Flux.1 is that it allows completely local installation and usage.

Like many open source packages, there are free and paid versions. The paid Pro version gives the most impressive results but is only available via their API (no purely local generation). There is also a dev version that developers can run locally but not use commercially, and a free schnell version for personal use. Both the dev and schnell versions are available for local install and use.

So, let’s get started with the schnell version. The instructions are the same for dev, except that you use different model/weight files.

Instructions for installing Flux.1 on an NVIDIA-based Windows 10/11 system:

  1. Prerequisites:
    • Ensure you have Python installed (I used 3.12.5)
    • Ensure you have pip installed (I used pip 24.2)
    • Ensure you have git installed and working
    • You might want to enable Windows Long Path support, as Python sometimes requires it for dependent packages. Be sure to reboot your system after enabling it.
    • A supported graphics card.
    • 32 GB of system RAM (though you can use the smaller text encoder file mentioned in the model download step if you have less RAM)
  2. Open a command prompt and make a local working root directory somewhere. I’ll use C:\depot\
  3. We’re going to follow the instructions on the ComfyUI git page.
    • Clone the ComfyUI project
C:\depot> git clone https://github.com/comfyanonymous/ComfyUI.git
  4. Install PyTorch

NVIDIA users should install stable PyTorch using this command:

C:\depot> pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121

Alternatively, this command installs the PyTorch nightly build, which might have performance improvements:

C:\depot>pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu124
  5. Change directory into ComfyUI and confirm that the requirements.txt file is there:
C:\depot> cd ComfyUI
C:\depot\ComfyUI> dir requirements.txt
  6. Use pip to install all the ComfyUI requirements:
C:\depot\ComfyUI>pip install -r requirements.txt
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: torch in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from -r requirements.txt (line 1)) (2.4.0+cu121)
Collecting torchsde (from -r requirements.txt (line 2))
Downloading torchsde-0.2.6-py3-none-any.whl.metadata (5.3 kB)
Requirement already satisfied: torchvision in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from -r requirements.txt (line 3)) (0.19.0+cu121)
Requirement already satisfied: torchaudio in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from -r requirements.txt (line 4)) (2.4.0+cu121)
Collecting einops (from -r requirements.txt (line 5))
Downloading einops-0.8.0-py3-none-any.whl.metadata (12 kB)
Collecting transformers>=4.28.1 (from -r requirements.txt (line 6))
Downloading transformers-4.44.0-py3-none-any.whl.metadata (43 kB)
Collecting tokenizers>=0.13.3 (from -r requirements.txt (line 7))
Downloading tokenizers-0.20.0-cp312-none-win_amd64.whl.metadata (6.9 kB)
Collecting sentencepiece (from -r requirements.txt (line 8))
Downloading sentencepiece-0.2.0-cp312-cp312-win_amd64.whl.metadata (8.3 kB)
Collecting safetensors>=0.4.2 (from -r requirements.txt (line 9))
Downloading safetensors-0.4.4-cp312-none-win_amd64.whl.metadata (3.9 kB)
Collecting aiohttp (from -r requirements.txt (line 10))
Downloading aiohttp-3.10.2-cp312-cp312-win_amd64.whl.metadata (7.8 kB)
Collecting pyyaml (from -r requirements.txt (line 11))
Downloading PyYAML-6.0.2-cp312-cp312-win_amd64.whl.metadata (2.1 kB)
Requirement already satisfied: Pillow in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from -r requirements.txt (line 12)) (10.4.0)
Collecting scipy (from -r requirements.txt (line 13))
Downloading scipy-1.14.0-cp312-cp312-win_amd64.whl.metadata (60 kB)
Collecting tqdm (from -r requirements.txt (line 14))
Downloading tqdm-4.66.5-py3-none-any.whl.metadata (57 kB)
Collecting psutil (from -r requirements.txt (line 15))
Downloading psutil-6.0.0-cp37-abi3-win_amd64.whl.metadata (22 kB)
Collecting kornia>=0.7.1 (from -r requirements.txt (line 18))
Downloading kornia-0.7.3-py2.py3-none-any.whl.metadata (7.7 kB)
Collecting spandrel (from -r requirements.txt (line 19))
Downloading spandrel-0.3.4-py3-none-any.whl.metadata (14 kB)
Collecting soundfile (from -r requirements.txt (line 20))
Downloading soundfile-0.12.1-py2.py3-none-win_amd64.whl.metadata (14 kB)
Requirement already satisfied: filelock in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from torch->-r requirements.txt (line 1)) (3.15.4)
Requirement already satisfied: typing-extensions>=4.8.0 in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from torch->-r requirements.txt (line 1)) (4.12.2)
Requirement already satisfied: sympy in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from torch->-r requirements.txt (line 1)) (1.13.1)
Requirement already satisfied: networkx in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from torch->-r requirements.txt (line 1)) (3.3)
Requirement already satisfied: jinja2 in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from torch->-r requirements.txt (line 1)) (3.1.4)
Requirement already satisfied: fsspec in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from torch->-r requirements.txt (line 1)) (2024.6.1)
Requirement already satisfied: setuptools in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from torch->-r requirements.txt (line 1)) (72.1.0)
Requirement already satisfied: numpy>=1.19 in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from torchsde->-r requirements.txt (line 2)) (2.0.1)
Collecting trampoline>=0.1.2 (from torchsde->-r requirements.txt (line 2))
Downloading trampoline-0.1.2-py3-none-any.whl.metadata (10 kB)
Collecting huggingface-hub<1.0,>=0.23.2 (from transformers>=4.28.1->-r requirements.txt (line 6))
Downloading huggingface_hub-0.24.5-py3-none-any.whl.metadata (13 kB)
Collecting packaging>=20.0 (from transformers>=4.28.1->-r requirements.txt (line 6))
Downloading packaging-24.1-py3-none-any.whl.metadata (3.2 kB)
Collecting regex!=2019.12.17 (from transformers>=4.28.1->-r requirements.txt (line 6))
Downloading regex-2024.7.24-cp312-cp312-win_amd64.whl.metadata (41 kB)
Collecting requests (from transformers>=4.28.1->-r requirements.txt (line 6))
Downloading requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting tokenizers>=0.13.3 (from -r requirements.txt (line 7))
Downloading tokenizers-0.19.1-cp312-none-win_amd64.whl.metadata (6.9 kB)
Collecting aiohappyeyeballs>=2.3.0 (from aiohttp->-r requirements.txt (line 10))
Downloading aiohappyeyeballs-2.3.5-py3-none-any.whl.metadata (5.8 kB)
Collecting aiosignal>=1.1.2 (from aiohttp->-r requirements.txt (line 10))
Downloading aiosignal-1.3.1-py3-none-any.whl.metadata (4.0 kB)
Collecting attrs>=17.3.0 (from aiohttp->-r requirements.txt (line 10))
Downloading attrs-24.2.0-py3-none-any.whl.metadata (11 kB)
Collecting frozenlist>=1.1.1 (from aiohttp->-r requirements.txt (line 10))
Downloading frozenlist-1.4.1-cp312-cp312-win_amd64.whl.metadata (12 kB)
Collecting multidict<7.0,>=4.5 (from aiohttp->-r requirements.txt (line 10))
Downloading multidict-6.0.5-cp312-cp312-win_amd64.whl.metadata (4.3 kB)
Collecting yarl<2.0,>=1.0 (from aiohttp->-r requirements.txt (line 10))
Downloading yarl-1.9.4-cp312-cp312-win_amd64.whl.metadata (32 kB)
Collecting colorama (from tqdm->-r requirements.txt (line 14))
Downloading colorama-0.4.6-py2.py3-none-any.whl.metadata (17 kB)
Collecting kornia-rs>=0.1.0 (from kornia>=0.7.1->-r requirements.txt (line 18))
Downloading kornia_rs-0.1.5-cp312-none-win_amd64.whl.metadata (8.9 kB)
Collecting cffi>=1.0 (from soundfile->-r requirements.txt (line 20))
Downloading cffi-1.17.0-cp312-cp312-win_amd64.whl.metadata (1.6 kB)
Collecting pycparser (from cffi>=1.0->soundfile->-r requirements.txt (line 20))
Downloading pycparser-2.22-py3-none-any.whl.metadata (943 bytes)
Collecting idna>=2.0 (from yarl<2.0,>=1.0->aiohttp->-r requirements.txt (line 10))
Downloading idna-3.7-py3-none-any.whl.metadata (9.9 kB)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from jinja2->torch->-r requirements.txt (line 1)) (2.1.5)
Collecting charset-normalizer<4,>=2 (from requests->transformers>=4.28.1->-r requirements.txt (line 6))
Downloading charset_normalizer-3.3.2-cp312-cp312-win_amd64.whl.metadata (34 kB)
Collecting urllib3<3,>=1.21.1 (from requests->transformers>=4.28.1->-r requirements.txt (line 6))
Downloading urllib3-2.2.2-py3-none-any.whl.metadata (6.4 kB)
Collecting certifi>=2017.4.17 (from requests->transformers>=4.28.1->-r requirements.txt (line 6))
Downloading certifi-2024.7.4-py3-none-any.whl.metadata (2.2 kB)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in c:\users\matt\appdata\local\packages\pythonsoftwarefoundation.python.3.12_qbz5n2kfra8p0\localcache\local-packages\python312\site-packages (from sympy->torch->-r requirements.txt (line 1)) (1.3.0)
Downloading torchsde-0.2.6-py3-none-any.whl (61 kB)
Downloading einops-0.8.0-py3-none-any.whl (43 kB)
Downloading transformers-4.44.0-py3-none-any.whl (9.5 MB)
---------------------------------------- 9.5/9.5 MB ? eta 0:00:00
Downloading tokenizers-0.19.1-cp312-none-win_amd64.whl (2.2 MB)
---------------------------------------- 2.2/2.2 MB 3.9 MB/s eta 0:00:00
Downloading sentencepiece-0.2.0-cp312-cp312-win_amd64.whl (991 kB)
---------------------------------------- 992.0/992.0 kB 2.3 MB/s eta 0:00:00
Downloading safetensors-0.4.4-cp312-none-win_amd64.whl (286 kB)
Downloading aiohttp-3.10.2-cp312-cp312-win_amd64.whl (376 kB)
Downloading PyYAML-6.0.2-cp312-cp312-win_amd64.whl (156 kB)
Downloading scipy-1.14.0-cp312-cp312-win_amd64.whl (44.5 MB)
---------------------------------------- 44.5/44.5 MB 2.9 MB/s eta 0:00:00
Downloading tqdm-4.66.5-py3-none-any.whl (78 kB)
Downloading psutil-6.0.0-cp37-abi3-win_amd64.whl (257 kB)
Downloading kornia-0.7.3-py2.py3-none-any.whl (833 kB)
---------------------------------------- 833.3/833.3 kB 1.7 MB/s eta 0:00:00
Downloading spandrel-0.3.4-py3-none-any.whl (268 kB)
Downloading soundfile-0.12.1-py2.py3-none-win_amd64.whl (1.0 MB)
---------------------------------------- 1.0/1.0 MB 7.9 MB/s eta 0:00:00
Downloading aiohappyeyeballs-2.3.5-py3-none-any.whl (12 kB)
Downloading aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Downloading attrs-24.2.0-py3-none-any.whl (63 kB)
Downloading cffi-1.17.0-cp312-cp312-win_amd64.whl (181 kB)
Downloading frozenlist-1.4.1-cp312-cp312-win_amd64.whl (50 kB)
Downloading huggingface_hub-0.24.5-py3-none-any.whl (417 kB)
Downloading kornia_rs-0.1.5-cp312-none-win_amd64.whl (1.3 MB)
---------------------------------------- 1.3/1.3 MB 6.5 MB/s eta 0:00:00
Downloading multidict-6.0.5-cp312-cp312-win_amd64.whl (27 kB)
Downloading packaging-24.1-py3-none-any.whl (53 kB)
Downloading regex-2024.7.24-cp312-cp312-win_amd64.whl (269 kB)
Downloading trampoline-0.1.2-py3-none-any.whl (5.2 kB)
Downloading yarl-1.9.4-cp312-cp312-win_amd64.whl (76 kB)
Downloading colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Downloading requests-2.32.3-py3-none-any.whl (64 kB)
Downloading certifi-2024.7.4-py3-none-any.whl (162 kB)
Downloading charset_normalizer-3.3.2-cp312-cp312-win_amd64.whl (100 kB)
Downloading idna-3.7-py3-none-any.whl (66 kB)
Downloading urllib3-2.2.2-py3-none-any.whl (121 kB)
Downloading pycparser-2.22-py3-none-any.whl (117 kB)
Installing collected packages: trampoline, sentencepiece, urllib3, scipy, safetensors, regex, pyyaml, pycparser, psutil, packaging, multidict, kornia-rs, idna, frozenlist, einops, colorama, charset-normalizer, certifi, attrs, aiohappyeyeballs, yarl, tqdm, requests, cffi, aiosignal, torchsde, soundfile, kornia, huggingface-hub, aiohttp, tokenizers, spandrel, transformers
WARNING: The script normalizer.exe is installed in 'C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script tqdm.exe is installed in 'C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script huggingface-cli.exe is installed in 'C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: 'C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\transformers\models\deprecated\trajectory_transformer\convert_trajectory_transformer_original_pytorch_checkpoint_to_pytorch.py'
HINT: This error might have occurred since this system does not have Windows Long Path support enabled. You can find information on how to enable this at https://pip.pypa.io/warnings/enable-long-paths

c:\depot\ComfyUI>
  7. Download and install the model data files in the correct folders

After you have ComfyUI downloaded, you need to get the model files and put them in the right places. The model files are found here; download them and put each one inside the proper ComfyUI\models\ subfolder.

You have a few options. First, you need to pick whether you’re using the non-commercial Dev version or the Schnell version. After that, each has the option of a single easy-to-use checkpoint package file, or each of the model data files individually. I’ll be using the Schnell ones, but you just need to get the Dev ones from the Dev branch if you want those instead.

If you’re running out of memory, you can replace \clip\t5xxl_fp16.safetensors with t5xxl_fp8_e4m3fn.safetensors, located here.

Checkpoint file (the one linked below is the Dev fp8 checkpoint; the Schnell equivalent goes in the same folder):

File | Download link | Copy location
flux1-dev-fp8.safetensors | https://huggingface.co/Comfy-Org/flux1-dev/blob/main/flux1-dev-fp8.safetensors | ComfyUI\models\checkpoints

Schnell individual files:

File | Download link | Copy location
t5xxl_fp16.safetensors | https://huggingface.co/comfyanonymous/flux_text_encoders/tree/main | ComfyUI\models\clip\
ae.safetensors | https://huggingface.co/black-forest-labs/FLUX.1-schnell/blob/main/ae.safetensors | ComfyUI\models\vae\
flux1-schnell.safetensors | https://huggingface.co/black-forest-labs/FLUX.1-schnell/blob/main/flux1-schnell.safetensors | ComfyUI\models\unet\
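
Since a misplaced file is an easy mistake here, a short script can confirm that the individual Schnell files landed in the folders listed above. This is only a sketch: the file names and subfolders are taken from the table, while the function name and the C:\depot\ComfyUI root are my own assumptions for illustration.

```python
from pathlib import Path

# Subfolder of ComfyUI\models -> files expected there (from the table above).
EXPECTED_FILES = {
    "clip": ["t5xxl_fp16.safetensors"],
    "vae": ["ae.safetensors"],
    "unet": ["flux1-schnell.safetensors"],
}


def missing_model_files(comfy_root, expected=EXPECTED_FILES):
    """Return the relative paths of expected model files that are not present."""
    models_dir = Path(comfy_root) / "models"
    missing = []
    for subfolder, names in expected.items():
        for name in names:
            if not (models_dir / subfolder / name).is_file():
                missing.append(str(Path(subfolder) / name))
    return missing


if __name__ == "__main__":
    # Hypothetical install location; adjust to wherever you cloned ComfyUI.
    for path in missing_model_files(r"C:\depot\ComfyUI"):
        print("missing:", path)
```

If you swapped in the fp8 text encoder, change the entry under "clip" to match the file you actually downloaded.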
  8. Start up the server by running main.py with Python
C:\depot\ComfyUI>python main.py

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "C:\depot\ComfyUI\main.py", line 83, in <module>
    import comfy.utils
  File "C:\depot\ComfyUI\comfy\utils.py", line 20, in <module>
    import torch
  File "C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\torch\__init__.py", line 2120, in <module>
    from torch._higher_order_ops import cond
  File "C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\torch\_higher_order_ops\__init__.py", line 1, in <module>
    from .cond import cond
  File "C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\torch\_higher_order_ops\cond.py", line 5, in <module>
    import torch._subclasses.functional_tensor
  File "C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\torch\_subclasses\functional_tensor.py", line 42, in <module>
    class FunctionalTensor(torch.Tensor):
  File "C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\torch\_subclasses\functional_tensor.py", line 258, in FunctionalTensor
    cpu = _conversion_method_template(device=torch.device("cpu"))
C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\torch\_subclasses\functional_tensor.py:258: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\utils\tensor_numpy.cpp:84.)
  cpu = _conversion_method_template(device=torch.device("cpu"))
Total VRAM 24576 MB, total RAM 32492 MB
pytorch version: 2.4.0+cu121
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 3090 : cudaMallocAsync
Using pytorch cross attention
C:\depot\ComfyUI\comfy\extra_samplers\uni_pc.py:19: SyntaxWarning: invalid escape sequence '\h'
  """Create a wrapper class for the forward SDE (VP type).
****** User settings have been changed to be stored on the server instead of browser storage. ******
****** For multi-user setups add the --multi-user CLI argument to enable multiple user profiles. ******
[Prompt Server] web root: C:\depot\ComfyUI\web
C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\kornia\feature\lightglue.py:44: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)

Import times for custom nodes:
   0.0 seconds: C:\depot\ComfyUI\custom_nodes\websocket_image_save.py

Starting server

To see the GUI go to: http://127.0.0.1:8188
  9. Open your web browser and go to http://127.0.0.1:8188
  10. Click on the ‘Queue Prompt’ button to execute the current prompt

Technically, it queues up the work, and you should see progress in the command window where you launched python main.py.

got prompt
model weight dtype torch.float8_e4m3fn, manual cast: torch.bfloat16
model_type FLOW
Using pytorch attention in VAE
Using pytorch attention in VAE
Model doesn't have a device attribute.
C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Model doesn't have a device attribute.
loaded straight to GPU
Requested to load Flux
Loading 1 new model
Requested to load FluxClipModel_
Loading 1 new model
C:\depot\ComfyUI\comfy\ldm\modules\attention.py:407: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
  out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
100%|████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:04<00:00,  1.18s/it]
Requested to load AutoencodingEngine
Loading 1 new model
Prompt executed in 23.65 seconds
  11. When it completes you should see your image. You can then save your image or tweak the parameters.
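
If you would rather verify the step 1 prerequisites before starting, a short script along these lines could help. It is a sketch, not part of ComfyUI: the minimum Python version is my own placeholder (I only know that 3.12.5 worked for me), the function names are hypothetical, and the Long Path check reads the LongPathsEnabled registry value, which only exists on Windows.

```python
import importlib.util
import shutil
import sys


def long_paths_enabled():
    """Return True if Windows Long Path support is switched on (Windows only)."""
    import winreg  # only available on Windows

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _ = winreg.QueryValueEx(key, "LongPathsEnabled")
            return value == 1
    except OSError:
        return False


def check_prerequisites(min_python=(3, 10)):
    """Return a dict of prerequisite name -> True/False.

    min_python is an assumed floor for illustration, not an official requirement.
    """
    results = {
        "python": sys.version_info[:2] >= min_python,
        "pip": importlib.util.find_spec("pip") is not None,
        "git": shutil.which("git") is not None,
    }
    if sys.platform == "win32":
        results["long_paths"] = long_paths_enabled()
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Checking Long Path support up front matters here because the pip install log above ends with exactly that error.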

Debugging help:

  1. numpy is not available

On my first runs, I got this in the console when I queued up a request:

got prompt
model weight dtype torch.float8_e4m3fn, manual cast: torch.bfloat16
model_type FLOW
Using pytorch attention in VAE
Using pytorch attention in VAE
Model doesn't have a device attribute.
C:\Users\matt\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Model doesn't have a device attribute.
loaded straight to GPU
Requested to load Flux
Loading 1 new model
Requested to load FluxClipModel_
Loading 1 new model
C:\depot\ComfyUI\comfy\ldm\modules\attention.py:407: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
  out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
100%|████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:04<00:00,  1.19s/it]
Requested to load AutoencodingEngine
Loading 1 new model
!!! Exception during processing!!! Numpy is not available
Traceback (most recent call last):
  File "C:\depot\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\depot\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\depot\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\depot\ComfyUI\nodes.py", line 1445, in save_images
    i = 255. * image.cpu().numpy()
               ^^^^^^^^^^^^^^^^^^^
RuntimeError: Numpy is not available

Prompt executed in 26.44 seconds

It turns out that I, like others, had the wrong version of numpy. I fixed it by exiting the server (Ctrl-C) and then installing numpy version 1.26.4:

C:\depot\ComfyUI>pip install numpy==1.26.4
Defaulting to user installation because normal site-packages is not writeable
Collecting numpy==1.26.4
  Downloading numpy-1.26.4-cp312-cp312-win_amd64.whl.metadata (61 kB)
Downloading numpy-1.26.4-cp312-cp312-win_amd64.whl (15.5 MB)
   ---------------------------------------- 15.5/15.5 MB 57.4 MB/s eta 0:00:00
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 2.0.1
    Uninstalling numpy-2.0.1:
      Successfully uninstalled numpy-2.0.1
Successfully installed numpy-1.26.4

C:\depot\ComfyUI>
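
To confirm the pin took effect before restarting the server, you can ask Python which numpy version is installed. This is a minimal sketch that only looks at the major version number; the helper names are my own, and it assumes the same Python environment that runs main.py.

```python
from importlib import metadata


def major_version(version):
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def numpy_is_1x(version=None):
    """True if the given (or currently installed) numpy is a 1.x release."""
    if version is None:
        try:
            version = metadata.version("numpy")
        except metadata.PackageNotFoundError:
            return False  # numpy is not installed at all
    return major_version(version) < 2


if __name__ == "__main__":
    print("numpy 1.x installed:", numpy_is_1x())
```

The startup warning in the server log explains why this matters: modules compiled against NumPy 1.x cannot run under NumPy 2.0.1.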

  2. Uninstall the pip packages, clear your pip cache, then re-install the requirements

The first time I installed, I got an error when downloading the numpy library during the step in which you pip install the requirements. To uninstall the packages listed in requirements.txt, clear the pip cache, and then re-install all the requirements, I did the following:

C:\depot\ComfyUI> pip uninstall -r requirements.txt -y 
C:\depot\ComfyUI> python -m pip cache purge

Then I re-ran all the pip installation commands.

Links:

Other generative AI installation guides:

I have previously posted instructions on how to install Stable Diffusion 2 (as well as Stable Diffusion 1.5 and 1.4), along with some other package installs.

Sound cards for a retro PC build

I was recently making my own retro 486 DX 66 PC build and needed to add an ISA sound card that supported both DOS and Windows games. A genuine Sound Blaster card would definitely work, but buying a genuine Sound Blaster Pro will run you well over $150 (over $200 with its box).

In googling around, I found this great thread on Vogons where someone asked the same question: Is there a cheaper alternative than finding a Sound Blaster/Sound Blaster Pro? It turns out there is – the really excellent ESS AudioDrive ES1868.

I had not heard of the ESS AudioDrive ES1868 ISA sound card before, but it is considered one of the best Sound Blaster clone cards. It has tons of features such as Sound Blaster Pro 2 compatibility (something even the Sound Blaster 16 doesn’t have!). It is extremely easy to set up for DOS and Windows, has mixer inputs for line-in, microphone, CD input, wavetable, and is a really quiet card (as opposed to Sound Blaster 16’s that suffered from chronic hum and pop issues to the point it was often called the ‘NoiseBlaster’). The drivers are easy to set up and even support non-PnP configuration. It makes the card work with 99% of DOS games. Even better, the cards are readily available for around $25-$30.

I bought a card for $25 off eBay and installed it without issue. The ESS drivers are available on Phil’s Computer Lab (link below). I downloaded the drivers, ran the installer, and set the parameters during install to the same values as a default Sound Blaster card: A220 I7 D1 H5 P330 T6
Address: 220h
IRQ: 7
DMA: 1
High DMA: 5
MIDI port: 330h
Type: 6
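
For anyone who finds the compact A220 I7 D1 H5 P330 T6 form hard to read, here is a small sketch that decodes it. It assumes the usual meaning of the letters (A and P are hexadecimal port addresses, the rest are decimal); the function and field names are mine, not part of any driver.

```python
FIELD_NAMES = {
    "A": "base address",
    "I": "IRQ",
    "D": "8-bit DMA",
    "H": "16-bit DMA",
    "P": "MIDI port",
    "T": "card type",
}
HEX_FIELDS = {"A", "P"}  # port addresses are written in hexadecimal


def parse_blaster(setting):
    """Decode a Sound Blaster settings string into a name -> number dict."""
    result = {}
    for token in setting.split():
        key, value = token[0].upper(), token[1:]
        name = FIELD_NAMES.get(key, key)
        result[name] = int(value, 16) if key in HEX_FIELDS else int(value)
    return result


if __name__ == "__main__":
    for name, value in parse_blaster("A220 I7 D1 H5 P330 T6").items():
        shown = f"{value:X}h" if name in ("base address", "MIDI port") else value
        print(f"{name}: {shown}")
```

Games that ask for Sound Blaster settings want these same values, which is why leaving them at the defaults saves a lot of fiddling.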

I then popped up my copy of Wolfenstein 3D, chose the Sound Blaster output option with default parameters and got all the awesome audio of yesteryear.

Learning everything there is to know about the different Sound Blaster and clone sound cards:

DOS Days has really excellent write-ups on all the various Sound Blaster cards with pros and cons of each. I’m really glad I read up on the different models before buying a generic Sound Blaster 16. There’s a tremendous wealth of information about issues unique to each card. Definitely a site worth reading before buying a card from eBay.

They also have an exhaustive list of all kinds of other sound cards which includes info on the ESS Audiodrive cards. There’s a ton of great information about the different models and where they fit in the sound card landscape. A definite must read.

Links: