sudovi.com

Author Archive

Wrapping non-ascii characters in Django

by sudovi on Sep.17, 2011, under Django

One of our long-time clients recently hired a very talented design firm to give their site (and brand) a facelift. During said facelift, a very strange font was selected that we had to serve up via embedded fonts. It just so happened, however, that this font had very large registered trademark symbols for some reason. So large, in fact, that our client mandated that we “find a way” to fix it.

After some consideration, I decided the best way to approach such an issue was to create a custom filter that we could use wherever we were expecting the database to hand us one of these symbols. (Since our database is using UTF-8, we’re able to literally store those symbols directly in the db. At least that’s how I understand it, and it is pretty evident when you perform a select on one of their products.)

For the impatient, what I ended up with:

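from django.utils.encoding import smart_str
from django.utils.safestring import mark_safe

def wrap_symbol(value, classname):
    """
    Looks for registered trademark symbols and wraps them in
    a css class to allow for styling.

    Usage:
    {{ string|wrap_symbol:'classname' }}

    TODO: allow specification of symbol(s) to replace
    """
    classname = smart_str(classname)
    # '\xc2\xae' is the UTF-8 byte sequence for the registered trademark symbol
    repl_text = "<span class='" + classname + "'>\xc2\xae</span>"

    try:
        # Python 2: smart_str() hands back a UTF-8 byte string to work on
        string = smart_str(value).replace('\xc2\xae', smart_str(repl_text))
    except ValueError:
        # leave the value untouched if it cannot be converted
        return value
    return mark_safe(string)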

But my first attempt had the replacement taking place as such:

string = value.replace("\xc2\xae", "<span class='" + classname + "'>\xc2\xae</span>")

I was repeatedly met with the following error: ‘ascii’ codec can’t decode byte 0xc2 in position 0: ordinal not in range(128). While I didn’t expect it to work out of the gate (things rarely do for me), I was in no way expecting the multi-hour battle that would ensue, largely due to my lack of understanding of character encodings and some unexpected output from print statements. But I digress. Off to the shell for some experimentation:


>>> from products.models import Product
>>> p = Product.objects.get(pk=25)
>>> p.name
u'CompanyName\xae ProductName\xae'
>>> print p.name
CompanyName® ProductName®  # (I'm paraphrasing here)
>>> print unicode(p.name)
CompanyName® ProductName®
>>> unicode(p.name)
u'CompanyName\xae ProductName\xae'

Had I only noticed the pattern that emerged during the above commands, I mightn’t be writing this post. Nevertheless, I knew that the DB should be giving me back the UTF-8 representation, so I hit the great Google for guidance. Through various articles and even a trip to #python and #django, I ended up trying something like:

>>> p.name.encode('utf-8')
'CompanyName\xc2\xae ProductName\xc2\xae'

Aha! Something different, if only slightly, but different enough to send me down another black hole. After many a Google search, I came across this, which ultimately led me to my final working destination.

Sadly, it wasn’t until after I had things working that I realized what was holding me up in my testing. Any time I called “print” from the shell or from my code, print was converting the output for me. (It was doing it right in front of my eyes, but apparently I refused to believe it.) A bare expression in the shell shows the escaped code point (\xae), while print encodes the string for the terminal and shows the symbol itself. Investigating with Python’s “type()” function, type(p.name) definitely returned ‘unicode’, so for the life of me I couldn’t figure out why my debug statements kept printing the actual symbol, which further muddied my already cloudy understanding of the issue.
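
The difference between the escaped form and the encoded bytes can be reproduced in a few lines. This is a minimal sketch in Python 3 rather than the Python 2 of the shell session above, so `str` here plays the role of Python 2's `unicode` and `bytes` the role of its `str`; the sample product name is made up.

```python
# Python 3 sketch: one character, two representations.
name = "CompanyName\u00ae"  # hypothetical name; U+00AE is the registered trademark sign

# The escaped form of the text shows the code point (\xae), much like the
# bare expression in the shell session did.
escaped = ascii(name)

# Encoding to UTF-8 turns that single code point into two bytes: 0xC2 0xAE.
encoded = name.encode("utf-8")

print(escaped)
print(encoded)

# Text and bytes do not mix: calling replace() on text with byte arguments
# fails, which is the Python 3 counterpart of the ascii codec error.
try:
    name.replace(b"\xc2\xae", b"(R)")
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

The two-byte sequence only exists once the text has been encoded, which is why the shell showed \xae for the unicode value and \xc2\xae after calling encode('utf-8').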

After one final hiccup involving inserting the actual replacement text, \xc2\xae, back into the string, I had a working filter that I can now use across my entire site. I’m still not entirely sure that I grasp exactly what was/is going on, but Django’s utilities make it so much easier to deal with this sort of thing. Man, these guys thought of _everything_.
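
For comparison, here is a rough sketch of the same idea in Python 3, where text is Unicode by default and no byte juggling is needed. It is a plain function rather than a registered Django filter so that it runs on its own. The `symbol` parameter is an assumption that picks up the TODO in the docstring, and in a real template filter the result would still need to be marked safe, as the original does with mark_safe.

```python
from html import escape

REGISTERED = "\u00ae"  # registered trademark sign


def wrap_symbol(value, classname, symbol=REGISTERED):
    """Wrap every occurrence of symbol in a span carrying classname.

    Assumes symbol is not itself an HTML-special character such as & or <.
    """
    # Escape first so the only markup in the result is the span added below.
    text = escape(str(value))
    wrapper = "<span class='{}'>{}</span>".format(escape(classname), symbol)
    return text.replace(symbol, wrapper)


wrapped = wrap_symbol("CompanyName\u00ae ProductName\u00ae", "reg")
```

Both symbols in the sample string end up inside their own span, and any other markup in the value is escaped rather than passed through.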

I’m Embarrassed

by sudovi on Aug.19, 2011, under Bad Code, Challenge

I’m embarrassed!  Every now and then — okay, quite frequently — I’m reminded that when I started my I.T. career I had no interest in being a developer.  It wasn’t until I got sick of getting certified in this technology or the other, and being married to a pager that I decided I wanted to learn how to be a developer.  I had no official training and am pretty much “self taught” — and at times, I feel that it shows through.

For example, I have a large client that has a huge email subscriber list that they send monthly emails to.  (Yes, they’re opted in so don’t go all *spammer* on me).  Occasionally they like to do targeted blasts based on zipcode (for those members we actually have a zipcode for).  This past month’s blast had over 28,000 zipcodes in the target list.  Due to an oversight, I wound up with a list of zipcodes that overlapped.  Basically it came down to having two files, one of 132277 lines and another with 113035 lines.  In these two files were approximately 30,000 or so overlapping email addresses that would have received both the targeted and non-targeted blasts had I not caught it.  *OUCH* that would NOT have been good.

I decided to parse through the two files with python since its syntax is fresh in my mind, and I’d have wound up googling too much had I decided to do it in bash and this was time-sensitive stuff.  So I busted out vi and coded up the following code (don’t laugh):

INFILE1 = 'all-email.lst'
INFILE2 = 'nofp-list.txt'

destinations = []
dupes = []

for line in open(INFILE1, 'r').readlines():
    for line2 in open(INFILE2, 'r').readlines():
        if line != line2:
            print line2
            destinations.append(line2)
        else:
            dupes.append(line2)

This code subsequently hung my machine as it struggled to loop over so much. I knew this wasn’t gonna be the final version as I was writing it (I had to get my creative juices flowing first), but I really didn’t expect it to hang my machine. I had to power off my machine and then revise the code once my system came back up. After some initial tweaks I had this (thinking that it was just too much to read all that into memory, and failing to see the real problem for what it was: that nested for loop):

import fileinput

INFILE1 = 'all-email.lst'
INFILE2 = 'nofp-list.txt'

destinations = []
dupes = []

for line in fileinput.input([INFILE1]):
    for line2 in fileinput.input([INFILE2]):
        if line != line2:
            print line2
            destinations.append(line2)
        else:
            dupes.append(line2)

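This didn’t work either, as it was giving me an “input() already active” error: the fileinput module keeps one shared state, so a second fileinput.input() call fails while the first is still being read. Rather than investigate further, I turned around and ultimately ended up with this, and thought to myself, “that was dumb, I KNOW better than that”: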

INFILE1 = 'all-email.lst'
INFILE2 = 'nofp-list.txt'

destinations = []
dupes = []

list1 = open(INFILE1, 'r').readlines()
list2 = open(INFILE2, 'r').readlines()

for line in list2:
    if line in list1:
        dupes.append(line)
    else:
        print line
        destinations.append(line)

*DUH* read them both into a list and use Python’s “in” syntax to look for one in the other. Done. Now, I’m positive there are even more iterations that this code could have eventually evolved into, but this got the job done and didn’t suck my machine’s resources. Since it was a one-off, I stopped here. (NOTE: the code might not be exactly as I had it since this is largely from memory.)
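
One possible next iteration, sketched here in Python 3 (the code in this post is Python 2), swaps the list lookup for a set. `line in list1` still scans the list for every line, so two files of 132277 and 113035 lines can cost up to roughly 15 billion comparisons in the worst case, while a set lookup takes roughly constant time on average. The function name and the sample addresses are made up for illustration.

```python
def split_by_membership(candidates, exclude):
    """Split candidates into (destinations, dupes) by membership in exclude."""
    # Strip line endings so a missing final newline cannot hide a match.
    seen = {line.strip() for line in exclude}
    destinations, dupes = [], []
    for line in candidates:
        address = line.strip()
        if address in seen:
            dupes.append(address)
        else:
            destinations.append(address)
    return destinations, dupes


# Hypothetical sample data standing in for the two list files.
all_email = ["a@example.com\n", "b@example.com\n", "c@example.com\n"]
nofp_list = ["b@example.com\n", "d@example.com"]

destinations, dupes = split_by_membership(nofp_list, all_email)
```

With real files, the two arguments could simply be the open file objects, since iterating over a file yields its lines one at a time.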

In any case, it’s situations like these that I both despise and enjoy at the same time. I despise them because this should be easy, and I’m embarrassed by the fact that my first revision was so *dumb*, as if it lacked any thought. But I enjoy them because it’s a finite problem to solve and it allows me to exercise my brain a bit. Being someone who is starting to spend less time coding and more time in meetings, it feels good to do these exercises.

CLEARLY I’ve uncovered the need for me to code up a better process for doing these targeted blasts, as it would appear they are going to be doing more and more of them. I look forward to writing that code so I don’t have to go through something like this that should have been so very elementary! *Embarrassing!*

I invite you to share your approach for such a situation, A) so I can learn more from it and B) to find out if I’m really that far off anyway…

:wq!

It’s the small stuff…

by sudovi on Jun.03, 2011, under Challenge, Coding Methods, General

 

I know, I know! It’s been a very long time. I tell myself it’s because I only speak when I have something to say.  Whatever gets you by, right?

Anyway, over the past few weeks (very much off and on, more off than on) I’ve been working on an algorithm for building a schedule of teams for a league manager.  What started out as something that I thought should have been very simple turned out to be much more complicated than I’d originally thought.  I remember thinking “2 hours tops.”  Ever have one of those moments when you realize that you clearly weren’t thinking when you opened your big fat mouth?

Sure, some of you could probably write this in your sleep, but I did some googling, asked around a bit etc, and as it turns out, it’s not as easy as you might think.  (Just google for league scheduling algorithm, or better yet, let me google that for you.)  I think you’ll find that there are more lines of code than one would think and a number of people attempting to roll their own that have come running for help.

I found several good examples, but with many of them, the code wasn’t very reader-friendly.  I found myself down so many dead ends and just about the time I thought “this is it, I’ve got it” I’d fall flat on my face again.

I walked away.  (I did it on paper, which turned out to be pretty easy, and that’s how I came up with an algorithm that I was (and still am) sure I could use.)  However, it bothered the living daylights out of me that I couldn’t solve the problem. I eventually left my aforementioned algorithm behind, and though I believe it’d be way more efficient and someday plan to figure out how to get it working, in the interest of getting something done (one of my mantras) I moved forward with an approach that I was making progress with.  At long last, I had a schedule builder that I can use for the bocce league that I’ve been made captain of recently.

For those of you looking for code — I’m not going to show it for numerous reasons:

  1. Part of me still believes this should have been much simpler and frankly I’m embarrassed
  2. This sounds like it could be someone’s homework assignment and I’m not about to be giving any answers
  3. I may wind up using it in some actual software I’m working on

Overall, it comes down to me remembering to go back to the basics:

  1. Start small, solve only small problems.  If you feel it’s a big problem, break it down into its smallest chunks and solve each chunk as its own problem.
  2. Use pseudo-code — my oh my when did I ever stop pseudo-coding things?  So very helpful to break down the problem domain.  (Note to self, is there a possible software solution here….hmmmm)
  3. Comment the living crap out of your code.  When you walk away for a bit or get interrupted, you’ll be thankful.  Many people say good code comments itself.  I say “nay nay.” When you are solving an issue you’ve never solved before and had to google a few things, you wind up writing obscure bits of code or syntax that you’ll no doubt forget about and forget what purpose it served or what functionality it provided.   Comments are still a very good thing.  Anyone telling you different is plain wrong.
  4. Don’t be afraid to start over or delete code.
  5. Don’t be afraid to write something procedurally, at least at first.  My solution was very much done procedurally.  It was not OO or functional.  After all, an algorithm is a recipe.  A recipe is very procedural. Writing something procedural to solve a problem helped me a great deal in this case and provided me with a clearer understanding of the problem at hand and its solution.

Again, this is probably a very simple problem for some of you.  For some reason, the solution eluded me for much longer than I’d like to admit.  There is still one strange kink that I have to wrap a try/except block around, and that’s on my list to work out (as well as providing support for bye-weeks), but at last it’s refreshing to see a schedule actually being built.

I’m really curious to see if anyone else has worked on this problem, or if anyone else would like to try working it out for themselves in their language of choice and then come back and provide their insight/feedback etc.  I don’t expect you to show code since I didn’t even do that myself, but I’d love to hear your opinion on how easy/hard you found this to be.  In the end, I fully expect to be an idiot here and for it to be something very simple that just eluded me, and if that’s the case, I’m fine with it. Sorta.  But I’d love to find out if anyone else underestimated the issue.  So if you wanna give it a try, here are the rules I had to work with:

  1. X number of teams over N  number of weeks.
  2. In my league, each team plays each other only once.  (This was the sticking point here: we had 8 teams, and they wanted an 8-week season, which means there has to be a bye-week, or you just play a 7-week season.  Since they wanted 8 weeks, I have to add bye support in there.  I’m still working on that.  That being said, if you have 16 teams, you need at least a 15-week season, or 16 weeks and everyone has one bye week.  Or you can have 8 teams play each other twice for 14 weeks with two byes, etc., which is something else I haven’t worked out yet but plan on doing.)

There are a ton of edge cases, so let’s stick with the following:  8 teams, 7 weeks, everyone plays each other once.  Man…even saying it again makes it sound so simple, and yet I found it so much more difficult than I expected.  It wasn’t the hardest code I’ve ever had to come up with by a long shot, but it was much more painful than I thought it should have been.  Any takers?
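
For anyone taking up the challenge, one well-known technique for the stated rules is the circle method for round-robin scheduling: hold one team fixed and rotate the others one position each week. The sketch below is Python 3 and is not the code described in this post, which was never shown. The function name and team names are made up, and odd team counts are handled by adding a placeholder whose opponent gets the bye.

```python
BYE = None  # placeholder opponent used when the team count is odd


def round_robin(teams):
    """Return a list of weeks; each week is a list of (team, team) pairings."""
    rotation = list(teams)
    if len(rotation) % 2:
        rotation.append(BYE)
    n = len(rotation)
    schedule = []
    for _ in range(n - 1):
        # Pair the first slot with the last, the second with the second to last, ...
        week = [(rotation[i], rotation[n - 1 - i]) for i in range(n // 2)]
        schedule.append(week)
        # Keep the first team fixed and rotate everyone else by one position.
        rotation = [rotation[0], rotation[-1]] + rotation[1:-1]
    return schedule


teams = ["Team %d" % i for i in range(1, 9)]  # 8 hypothetical teams
schedule = round_robin(teams)
```

For 8 teams this produces 7 weeks of 4 games each. Stretching that to an 8-week season with bye weeks is a separate problem that this sketch does not attempt.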

 

:wq!

My Big Cop-Out

by sudovi on Nov.11, 2010, under General

If you’ve been here before, you’re probably wondering what happened to the old design.   While I enjoyed it, and most people “got” the hint, it wasn’t very nice looking, or very usable etc.  On top of that, it was sitting on top of a blog engine that I rolled myself using CodeIgniter.  Home-rolled blogging engines, to me anyway, are nothing more than a way to get used to a language or a framework.  It was fun, I learned some things, but one thing that annoyed the hell out of me was comment spam.  I knew I wanted to implement Akismet but didn’t have that kind of time and to be honest, I was getting tired of maintaining the codebase.

Alas, I’ve given in and gone WordPress.  Hell, I’ve even nabbed a free theme that I may tweak only slightly.  Why?!? Mainly because I am lazy and a very busy father of two girls and dedicated husband, and because I’m cheap.  (I’d love to write my own blogging engine in Django, but it costs me a bit more than I can justify spending at the moment for the hosting when I already have this hosting package here.)  So forgive me for copping out on you.

At some point, I’ll be adding my old entries from earlier (again, there’s that whole “time” thing).  But hopefully, since WordPress makes it so easy to get my thoughts out there, this will motivate me to blog more.  And with that said…here’s to hopin! ;)

:wq!
