Every character matters in an AI prompt, including the spaces between words and the punctuation. “Womens” is not a word in English. “Women” is already the plural form of “woman”, so the possessive needs an apostrophe and an s: “women's”.
With image generation, it is harder to inspect the tokenized model input, because the tools for this are not integrated into Automatic1111 or ComfyUI by default the way the feature is integrated into Oobabooga Textgen for LLMs. Looking at the tokenized input would show whether the word was omitted entirely or broken into smaller pieces, possibly down to single letters; at least, that is how LLMs do tokenization.
Always keep in mind that every word and style you use in a prompt must correlate with tags that were trained alongside the images. Many models are trained with natural language sentences, so they have some degree of natural language processing. It is not as sophisticated as the language processing in a text-to-text model, where special tokens connect the input to the output.
The way tokens are processed is a major aspect of the evolution of generative AI. For instance, the first Stable Diffusion 1.x models use CLIP L, which is a very small language processing model. The SDXL models use a dual processing setup with CLIP G and CLIP L in tandem. The latest Stable Diffusion model, SD3, uses a triple processing setup with G and L along with a full T5-XXL text-to-text large language model. I haven’t gone super in depth trying to understand the codebase from SD3, but there seems to be something weird happening with the T5, where SD3 is swapping an entire tensor layer each time the model loads instead of shipping a pretrained model or using a LoRA layering scheme.
Safety with generative AI is different from LLMs. It is not part of the model in the same way that safety works for an LLM. I found it fascinating how SD3 omits human genitalia, and I started looking into the code for ComfyUI as a result, because this behavior is deterministic and therefore not part of the actual tensor table maths. The behavior centers around the T5 model… Anyways, I’m getting stupid technical on a tangent… What I meant to say is that the text processing and tokenization are external to the tensor tables of the actual generative model. If the processing scheme is complex enough, it might be able to error-correct the prompt, but it is best to assume that the prompt will be used exactly as it was submitted.
Have you set up a rules file for USB? You must have a udev rule set up that gives your user access to the hardware. It is trivial to create, but it is one of those little headaches you learn as you go. Sparkfun and Adafruit should both have good tutorials if you search either of them for udev rules.
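Before writing a rule, it can help to confirm that permissions are actually the problem. Here is a minimal sketch for a POSIX shell. The function name check_access is made up for illustration, and /dev/ttyUSB0 is only a guess at where a USB serial adapter might show up on your system.

```shell
# check_access PATH: report whether the current user can read and write PATH.
# Hypothetical helper name, used only for this illustration.
check_access() {
    if [ ! -e "$1" ]; then
        echo "missing"   # nothing exists at that path (device unplugged?)
    elif [ -r "$1" ] && [ -w "$1" ]; then
        echo "ok"        # readable and writable by the current user
    else
        echo "denied"    # the node exists, but permissions block access
    fi
}

# /dev/ttyUSB0 is an assumed example path; substitute your own device node.
check_access /dev/ttyUSB0
```

If this prints denied for your device, a udev rule is the usual fix.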
Mine for a ch340 is done like this:
$ cd /etc/udev/rules.d
$ sudo nano 69-my-usb-serial-devices.rules
# ch340
SUBSYSTEM=="usb", ENV{DEVTYPE}=="usb_device", ATTR{idVendor}=="1a86", ATTR{idProduct}=="7523", MODE="0666"
I just told you to enter the terminal editor nano and enter a note that will help you remember that this is for the ch340
# ch340
followed by a line that sets the permissions for the device, using a rule that decides which users have access to it. I’m matching the rule on the vendor and product ID numbers. You can find these numbers with the
$ lsusb
command. FYI, the $ is standard shorthand for the command line as your regular user. This is opposed to #, which is short for the root user at the command line. (Inside the rules file itself, a leading # just marks a comment.) Once you have entered this line in nano, save the file and exit with
Ctrl+O, Enter, Ctrl+X
(nano lists its shortcuts at the bottom of the screen). The next time you plug in the device, udev should use this rule to set the permissions for the device to 0666, which means everyone can read and write, but not execute stuff from the port; with execute it would be 0777.
When you are trying to find info about a USB device, the following may be helpful:
$ sudo dmesg | grep -F "USB device number"
Note that the last line should be the most recently connected device.
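If you only want the most recent connection, the filtering can be sketched as a small function. This assumes the kernel messages look roughly like the sample lines below, which can vary between kernel versions. The name last_usb_device is made up, and the sample lines are illustrative, not captured output.

```shell
# last_usb_device: read dmesg-style text on stdin and print the bus-port
# and device number from the last "USB device number" message.
# Hypothetical helper; the assumed line shape is shown in the sample below.
last_usb_device() {
    grep -F "USB device number" |
        tail -n 1 |
        sed -n 's/.*usb \([0-9.-]*\): .*USB device number \([0-9]*\).*/port=\1 devnum=\2/p'
}

# Illustrative sample input (not real output from any machine).
printf '%s\n' \
    '[    2.100000] usb 1-1: new high-speed USB device number 2 using xhci_hcd' \
    '[ 1234.567890] usb 3-1: new full-speed USB device number 3 using xhci_hcd' |
    last_usb_device

# On a real system: sudo dmesg | last_usb_device
```

For the sample above, this prints port=3-1 devnum=3, the entry with the larger timestamp.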
$ dmesg
prints the kernel log (the kernel ring buffer), which includes the boot messages; it is not a systemd log. Depending on how your system is configured, you’ll probably see timestamps on the left, in seconds since boot. The initial bootup devices will show up with tightly grouped timestamps, while later connections will show a much larger number.
There have been some recent changes in Fedora that have broken a script I wrote to help me with all the various places where USB hardware is located and with finding the right info. I’m trying to parse that script for the key elements. The first step is to find the location of the hardware. You are looking for something like
/dev/bus/usb/003/003
or wherever the new device got mounted. This is only the start, because different parts of the device may be mounted in different locations. I’m not just talking about the CH340, but like, if you are doing microcontroller stuff that gets more complicated, like Forth, MicroPython, and CircuitPython, where there will be more going on than just the serial port, or you need to know low-level stuff. Once you know the specific port, you can use
$ udevadm info --attribute-walk --path=$(udevadm info --query=path --name=/dev/bus/usb/003/003) # enter the port for the device in question
In the past, my script used
$ dmesg
to retrieve the device location, then used
$ lsusb -D *device location*
to get the basic info. Then I went a layer deeper with the udevadm command to see everything related to the device. The command
$ sudo fdisk -l
might also help with some STM32-type stuff that has a DFU bootloader and identifies as a USB drive when plugged in… At least, I think that was the reason I kept that option in my script; it has been a while since I used one of those.
Edit: I can get the actual port location of a device now using
$ lsusb -t -vv
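To tie the pieces together: the Bus and Device numbers that lsusb prints map onto the /dev/bus/usb/BBB/DDD path, and the ID field carries the vendor and product numbers for a rule. Here is a rough sketch, assuming the usual one-line-per-device lsusb format. Both function names and the description text in the sample line are made up for illustration.

```shell
# usb_node: turn lsusb lines on stdin into /dev/bus/usb/BBB/DDD paths.
# Hypothetical helper; assumes "Bus BBB Device DDD: ID vvvv:pppp ..." lines.
usb_node() {
    sed -n 's|^Bus \([0-9]\{3\}\) Device \([0-9]\{3\}\): ID .*|/dev/bus/usb/\1/\2|p'
}

# usb_rule: turn lsusb lines on stdin into udev rule lines with mode 0666.
# Note the == operators: in udev rules == matches, while = assigns.
usb_rule() {
    sed -n 's|^Bus [0-9]* Device [0-9]*: ID \([0-9a-f]\{4\}\):\([0-9a-f]\{4\}\).*|SUBSYSTEM=="usb", ENV{DEVTYPE}=="usb_device", ATTR{idVendor}=="\1", ATTR{idProduct}=="\2", MODE="0666"|p'
}

# Illustrative sample line; the description text is invented.
sample='Bus 003 Device 003: ID 1a86:7523 Example CH340 serial adapter'
printf '%s\n' "$sample" | usb_node
printf '%s\n' "$sample" | usb_rule

# On a real system, something like:
#   lsusb | grep 1a86:7523 | usb_node
```

The path from usb_node is what goes into the --name= argument of the udevadm command, and the line from usb_rule is what goes into the rules file.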